Summary

Alternate assessments for 
special education students in 
the Southwest Region states 



In 2003 the U.S. Department of Educa- 
tion issued regulations allowing states to 
develop alternate standards and assess- 
ments for students with the most sig- 
nificant cognitive disabilities. This study 
reviews and summarizes alternate assess- 
ment policies and practices — and their 
implementation and impact — for the 
most significantly cognitively disabled 
students, across the five states in the 
Southwest Region. 

The No Child Left Behind Act of 2001 was 
the first federal act to require including all 
students in state and district accountabil- 
ity systems. In 2003 the U.S. Department of 
Education issued regulations allowing states 
to develop alternate assessment standards for 
students with the most significant cognitive 
disabilities — and to include some results from 
these assessments in annual school, district, 
and state accountability formulas as long as 
the number of such inclusions did not ex- 
ceed 1 percent of the combined population of 
students taking general and alternate assess- 
ments statewide (U.S. Department of Educa- 
tion, 2003b). The Individuals with Disabilities 
Education Act of 2004 and U.S. Department 
of Education (2006b) regulations issued in 
August 2006 further clarified the requirements
for assessing students with the most significant
cognitive disabilities. One important
change was that states now needed to link
alternate assessment standards to general
education standards.

Many states are struggling to identify alter- 
nate content standards, to find curricula that 
address these standards while meeting student 
needs, to locate teachers who can implement 
the curricula, and to ensure that alternate 
standards are demonstrably linked to general 
education standards in accordance with ex- 
pectations set by the No Child Left Behind Act. 
Most also face great challenges developing and 
implementing reliable and valid alternate as- 
sessments that can be implemented efficiently 
and comparably across the state. 

The survey and interviews conducted for this 
study suggest that the Southwest Region states 
have been tracking changes in their curricu- 
lar and assessment focus from functional to 
academic content. State representatives believe 
that changes in policies and practices have 
improved each state’s approach and emphasis, 
though they admit a need for more rigorous 
analysis of these relationships. 

For 2007/08 four of the five Southwest Region 
states will have instituted alternate portfo- 
lio assessments (based on work samples) or 
performance assessments (based on exemplars 
of proficient performance) for students with 
significant cognitive disabilities. Louisiana,
New Mexico, and Texas have transitioned 
dramatically. Louisiana has changed from a 
checklist to a multiple-choice measure. New 
Mexico has switched from a checklist based 
wholly on functional achievement to perfor- 
mance-based tasks that are linked to alternate 
achievement standards. Texas has created al- 
ternate achievement standards based on state 
general content standards and is transitioning 
from local choice in testing to a uniform state- 
developed portfolio system with a checklist. 

Given the range of student cognitive and
physical disabilities that the definitions cover,
no single alternate assessment will fit all students.



The wide range of skills and tasks targeted 
by alternate assessments creates challenges 
for comparability and for determinations of 
across-the-board technical adequacy. Much 
work is needed to establish alternate assess- 
ments that reflect adequate psychometric 
properties, instructional relevance, valid- 
ity, reliability, and usability. The Southwest 
Region states share many of the same needs. 
Each, however, has its own unique histories, 
values, populations, approaches, resources, 
and constraints — which must be taken into 
account in any attempt to address a particular 
state’s requirements or to study them further. 



OVERVIEW 

This study reviews policies and practices related 
to alternate assessment for the most significantly 
cognitively disabled students (see box 1 and ap- 
pendix A for definitions of key terms). It examines 
challenges that states across the nation encounter 
when implementing these policies and practices, 
and it presents information about alternate assess- 
ments in the five states served by the Southwest 
Regional Educational Laboratory: Arkansas, 
Louisiana, New Mexico, Oklahoma, and Texas (see 
appendix B for student demographic information 
for each state). The study’s findings emerge from 
documents, a survey, and interviews, summarized 
across the five states and compared with national 
findings and trends. 

Most students with disabilities participate in 
state and district assessments by taking existing 
assessments with testing accommodations. But 
a small percentage of students have disabilities 
that make their participation in general state and 
district tests impractical, if not impossible. Such 
participation is likely to yield inaccurate measures 
of academic achievement. Alternate assessments 
are intended for students unable to participate in 
state and district assessment systems, even with 
accommodations (Thompson, Johnstone, Thurlow, 
& Altman, 2005; Thurlow & Case, 2004). On De- 
cember 9, 2003, the U.S. Department of Education 
issued regulations that allowed states to use— for 
accountability purposes— alternate assessments 
based on alternate achievement standards for 
students with the most significant cognitive 
disabilities. 

A primary goal of both the No Child Left Behind 
Act of 2001 and the Individuals with Disabilities 
Education Act of 2004 was to include more stu- 
dents with significant cognitive disabilities in state 
and district accountability systems. For 2007/08
all five Southwest Region states will have approached
this goal, for example by moving from
checklists or assessments based on functional 
achievement to portfolio or performance tasks that 
are linked to alternate standards based on general
content standards. But peer review letters from
the U.S. Department of Education to each Southwest
Region state (U.S. Department of Education,
2006a) reveal that although most states have done
a good deal of work to implement the federal
requirements, few have met the technical challenges
related to implementing alternate assessments that
are linked to their general education counterparts.

BOX 1

Definitions of key terms

1 percent rule. When measuring adequate yearly progress, states
and school districts have the flexibility to count the “proficient”
and “advanced” scores of students with the most significant
cognitive disabilities who take alternate assessments based on
alternate achievement standards, as long as the number of scores
so counted does not exceed 1 percent of all students in the grades
assessed (or about 9 percent of students with disabilities).
(U.S. Department of Education, 2003a)

2 percent rule. When measuring adequate yearly progress, states
and local education agencies may count the “proficient” and
“advanced” scores of certain students who take alternate
assessments even though they are not identified as having the
most significant cognitive disabilities, as long as the number of
scores so counted does not exceed 2 percent of all students in the
grades assessed (or about 20 percent of students with disabilities).
(U.S. Department of Education, 2007)

Accommodations. A change in the administration of an assessment
(setting, scheduling, timing, presentation, response mode) to
achieve equity, not advantage, that does not change the construct
to be measured or the meaning of the resulting scores.
Accommodations should be identified in the student’s
individualized education program or an accommodation plan under
Section 504 of the Rehabilitation Act of 1973 and used regularly
during instruction and classroom assessment. (Policy to Practice
Study Group, 2003)

Adequate yearly progress. A provision of the No Child Left Behind
legislation requiring schools, districts, and states to demonstrate
on the basis of test scores that students are making academic
progress. Each state was required to submit by January 31, 2003,
a specific plan for monitoring adequate yearly progress. (Policy
to Practice Study Group, 2003)

Alternate assessment. An instrument used to gather information on
the standards-based performance and progress of a relatively small
population of students who are unable to participate in the general
assessment system, such as those whose disabilities preclude their
valid and reliable participation in general assessments. (Policy
to Practice Study Group, 2003)

Individualized education program. A document that reflects the
decisions made by an interdisciplinary team, including the parent
and student when appropriate, and identifies the abilities and
disabilities of a disabled student. (Policy to Practice Study
Group, 2003)

Peer review. The review of a state’s standards and assessment
system by state practitioners and experts to determine whether it
meets the requirements of the No Child Left Behind Act.

Reliability. The consistency of the test instrument; the extent to
which it is possible to generalize a specific behavior observed at
a specific time by a specific person to observations of similar
behavior at different times or by different behaviors. (Policy to
Practice Study Group, 2003)

Technical adequacy. The extent to which an assessment meets the
requirements for validity, reliability, accessibility, objectivity
and consistency with nationally recognized professional and
technical standards. Evidence for technical adequacy can include
information on administration, scoring, interpretation, and
technical data. (U.S. Department of Education, 2007)

Validity. The extent to which a test measures what it was designed
to measure. (Policy to Practice Study Group, 2003)



To provide an overview of the technical and support 
challenges that states face as they build alternate 
assessments for students with the most significant 
cognitive disabilities, and to supply a context for 
the study’s findings, researchers reviewed studies of 
the development, implementation, and validation of 
the assessments and surveyed or interviewed state 
department of education staff (see box 2). 






BOX 2 

Study methods 

To investigate the challenges to de- 
signing and implementing alternate 
assessments across the Southwest Re- 
gion states, the researchers developed 
six research questions: 

1. What challenges are states en- 
countering when implementing 
new alternate assessment policies 
and practices? 

2. What do alternate assessments 
across the Southwest Region 
states look like? 

3. What training or professional 
development is provided for 
teachers on alternate 
assessments? 



4. How are results collected and 
used at the state, district, school, 
and student levels? 

5. To what extent do state alternate 
assessments capture the same 
or similar skills as state tests 
designed for the general student 
population? 

6. What technical issues are 
states facing in developing and 
implementing reliable and valid 
alternate assessments? 

The study replicated the procedures used by Browder et al. (2005),
a mixed-methods approach to research that includes qualitative and
quantitative methods. When systematically combined, the methods
provide rigorous, methodologically sound investigations in a range
of fields (Creswell, Fetters, & Ivankova, 2004). The quantitative
data collection involved review of state materials, surveys, and
interviews using descriptive techniques. The qualitative data
collection consisted of semistructured surveys and a review of
documents.

The researchers organized the data into a matrix to compare state
policies, practices, and procedures. They integrated the results of
the quantitative method (survey) with those of the qualitative
method (review of documents), systematically analyzed the data,
and reported the results.

(For a fuller account of the study 
methodology, see appendix D.) 



They identified the challenges states are encoun- 
tering when implementing new alternate assess- 
ment policies and practices. They described what 
alternate assessments across the Southwest Region 
states look like and what training or professional 
development is provided for educators on alter- 
nate assessments. They looked at how results were 
being collected and used at the state, district, 
school, and student levels and the extent to which 
states’ alternate assessments capture the same 
or similar skills as state tests designed for the 
general student population. They also considered 
what technical issues states face in developing 
and implementing reliable and valid alternate 
assessments. They summarized current policies 
and practices in the five Southwest Region states 
and connected these practices to what is known 
nationally.

All the Southwest Region states report including 
increasing numbers of students with significant 
cognitive disabilities in state assessments and 
implementing changes in special education cur- 
ricula and instruction (see appendix C). Arkansas 
and Oklahoma, which have achieved full approval
status according to the federal peer review (U.S.
Department of Education, 2006a), report that 
they are continuing to improve programming and 
instruction by refining content achievement stan- 
dards to reflect a blend of functional and academic 
skills linked to content standards. 

Louisiana has changed from a checklist to a 
performance-based assessment and will be chang- 
ing to a more traditional testing approach. New 
Mexico has switched from a checklist based wholly 
on functional achievement to performance-based 
tasks that are linked with alternate achievement 
standards. Texas has created alternate achieve- 
ment standards based on state general content 
standards and is transitioning from locally 
selected alternate assessments and an optional 
state-developed alternate assessment to a uniform 
state-developed portfolio system with a checklist. 

A linchpin of the No Child Left Behind Act is the 
need to attend to what gets reported and used for 
accountability. This study revealed disparities 
in the five Southwest Region states’ definitions 
of significant cognitive disability, their criteria 
for alternate assessment participation, their task
selection, their scoring, and their reporting. Rep- 
resentatives of all five states openly discussed the 
technical challenges involved (see appendix C). 

As states conduct new alignment studies and 
standard setting activities, they are required to 
demonstrate the alignment of alternate assessment 
achievement standards with grade-level content 
standards and alternate assessments — a challeng- 
ing task (especially when states allow teachers to 
select the content standards being measured or 
the tasks for assessment). Because of changes to 
content standards, achievement standards, and 
assessment approaches in the past several years, 
three states lack continual, year-to-year trend 
data that could be used to measure progress and 
growth. Those three states are exploring how to 
conduct consequential validity studies to examine 
policies’ positive and negative effects. 

Representatives from Louisiana, New Mexico, and 
Texas underscored their states’ needs for addi- 
tional support to fully implement federal require- 
ments. States report that they might benefit from 
technical help as they develop robust strategies for 
collecting and using results at the state, district, 
school, and student levels. Despite the challenges, 
some districts are attempting to support the 
needs of students with disabilities and the current 
federal mandates. Researchers found similar needs 
for further research for all five states. 



WHY THIS STUDY? 






Alternate assessments for students with the most 
significant cognitive disabilities are fairly new in 
most states. Before the federal Individuals with
Disabilities Education Act of 1997, most students
with disabilities either were not included in state
testing or participated inconsistently (Thurlow,
2004). The 1997 act required states to include
students with disabilities in state testing, and it
boosted the use of alternate assessments to assess
these students, requiring states to create alternate
assessments
by the end of 2000. Besides requiring full and fair 
participation in assessment systems, the 1997 
act underscored that accommodations must be 
provided for students with disabilities and that 
individualized education program teams (IEP 
teams; see box 1) must make determinations about 
participation in alternate assessments according to 
state guidelines. 

The No Child Left Behind Act was the first federal 
act to require including all students in state and 
district accountability systems. A primary goal of 
both it and the Individuals with Disabilities Educa- 
tion Act of 2004 was to include more students with 
significant cognitive disabilities in state assessment 
and accountability systems. With the No Child Left 
Behind Act the movement toward alternate assess- 
ments picked up speed and urgency. It mandated 
not only that students with disabilities be included 
in state assessment programs, but also that they 
count equally with other designated subgroups for 
state and federal accountability. Follow-up regula- 
tions created the 1 percent group, or students with 
the most significant cognitive disabilities who could 
be assessed based on alternate content standards. 

The Individuals with Disabilities Education Act of 
1997 did not mandate how states should develop al- 
ternate assessment policies or procedures. So, states 
created different versions of alternate assessments 
based on their own special education student 
demographics, knowledge of the populations, and 
requirements for technical adequacy in assess- 
ments (Browder, Spooner, Algozzine, et al., 2003; 
Thurlow, 2004). Virtually all states have developed 
alternate achievement standards along with some 
form of alternate assessment and are attempting to 
link these to reform instructional practice. But recent
letters from the U.S. Department of Education's peer
review (a formal review and approval process
for state standards and assessment systems) reveal
that although most of the five Southwest Region 
states have done a good deal of work to implement 
federal requirements, few have met the challenges 
to implementing alternate assessments on a par 
with their general education counterparts (U.S.
Department of Education, 2006a). 

Referring to the assessment of students with sig- 
nificant cognitive disabilities under the No Child 
Left Behind Act, the U.S. Department of Education 
(2003b) defines an alternate assessment as “an as- 
sessment designed for the small number of students 
with disabilities who are unable to participate in 
the regular grade-level state assessment, even with 
appropriate accommodations.” Nonregulatory guid- 
ance also included the following language: 

To qualify as an assessment under Title I, an 
alternate assessment must be aligned with 
the state’s content standards, must yield re- 
sults separately in both reading and language 
arts and mathematics, and must be designed 
and implemented in a manner that supports 
use of the results as an indicator of adequate 
yearly progress. Alternate assessments can 
measure progress based on alternate achieve- 
ment standards... and can also measure 
proficiency based on grade-level achievement 
standards. Alternate assessments may be 
needed for students who have a broad variety 
of disabilities; consequently, a state may 
employ more than one alternate assessment. 
When used as part of the state assessment 
program, alternate assessments must have an 
explicit structure, guidelines for which stu- 
dents may participate, clearly defined scoring 
criteria and procedures, and a report format 
that communicates student performance in 
terms of the academic achievement standards 
defined by the state (U.S. Department of 
Education, 2005, p. 15). 

The definition and guidance reinforced the idea 
that alternate assessments were appropriate and 
allowable for the full range of the intended special 
education subpopulation. But they gave states 
considerable flexibility in developing alternate as- 
sessment policies, structures, and formats. 

Proponents of including students with significant
cognitive disabilities as full participants in state
assessments determined that the change would
give them a greater voice in the education system
and shift accountability to the principle that “all
means all” (Thurlow, 2004). The academic
achievement of students
with the most significant cognitive disabilities 
has consistently been underestimated by tradi- 
tional paper-and-pencil tests because of students’ 
inability to function in an on-demand environ- 
ment and respond in the limited manner allowed. 
States saw alternate assessments built to address 
the needs of this hard-to-assess population as a 
valid means to increase access and, consequently, 
make schools’ adequate yearly progress results 
more fair. 



In addition, given the traditional underemphasis
on academic instruction, as opposed to functional
skills, for students with disabilities
(Kleinert & Kearns, 1999), full inclusion with
formal accountability requirements was seen as 
a way to reform the education of students with 
significant cognitive disabilities. Advocates felt 
that information gleaned from test results could 
help guide instructional programming and curricula
(Crawford & Tindal, 2006). The outcomes
identified as a consequence of the alternate assess- 
ment would become a part of the daily education 
routine (Ford, Davern, & Schnorr, 2001). 

A review of 19 research-based articles related to 
alternate assessment policies (Browder, Spooner, 
Algozzine, et al., 2003) summarized the underlying
principles of alternate assessments following
the 1997 federal legislation. The researchers found 
four reasons why alternate assessments initially 
showed promise for state accountability and for 
students with significant cognitive disabilities. 
Advocates believed that: 



1. Greater consideration would be given to the 
students because they were participating in 
alternate assessments. 

2. Perceptions of low expectations for the students
would shift, affording students more
opportunities for success. 

3. All students would have access to the general 
education curriculum and, thus, to the same 
academic content standards. 

4. The quality of instructional programming 
would increase for students with significant 
cognitive disabilities — a change with implica- 
tions for the preparation of instructors. 

A research study published in 2005 indicated that 
since December 9, 2003— when the federal gov- 
ernment announced the 1 percent rule (see box 1 
and appendix A) as a new provision of No Child 
Left Behind legislation — 22 states, or 44 percent, 
had changed their alternate assessment policies to 
use alternate achievement standards (Thompson, 
Johnstone, Thurlow, & Altman, 2005). In particular,
the researchers believe that two important changes
have occurred under the No Child Left Behind Act:

• The focus of alternate assessments has shifted
greatly toward measuring academic skills
linked to state content standards.

• Given the attention that alternate assessments
now receive because of their inclusion in federal
and state formal accountability systems,
states have attempted to increase the technical
adequacy of their alternate assessments.

States across the country are struggling to address
the needs of students with the most significant
cognitive disabilities; to validly, reliably, and
accurately measure student performance; and to
develop alternate assessment systems that yield
high-quality data for evaluating school, district, and
state performance and improving instruction.

Although the five Southwest Region states have made
efforts to include students with significant
cognitive disabilities, they all still face great
challenges. The Southwest Region states differ in their
definitions of significant cognitive disability and
in their guidelines for student participation in
alternate assessments. And the technical quality
of their alternate assessments varies. Peer review
letters from the U.S. Department of Education
indicate that the five states in the Southwest Region
received full approval, approval expected, or
approval pending for their alternate assessments (U.S.
Department of Education, 2006a). But they needed
to provide extensive additional evidence from their
2005/06 alternate assessments in six areas:

1. Academic achievement standards. Three
states needed to submit science-achievement
standards. Two states needed to document
cutscores and performance descriptors for
other content areas measured.

2. Full assessment system. Two states needed to
submit alignment-study results and plans for
completing alignment studies.

3. Technical quality. Three states needed to
submit evidence of validity, reliability, and
usability. All three needed to prepare and
submit technical manuals.

4. Alignment results. Three states had to complete
studies of the alignment of alternate
achievement standards to assessments and
show their plans for filling any gaps in coverage.
The alignment issue is especially complex
for alternate assessments because greater local
choice and range of artifacts (portfolio work
samples, such as photographs or videotapes
of the student performing a task, audiotapes,
writing samples, drawings, and tests) in
alternate assessments mean that the method
for demonstrating alignment between content
standards and performance assessments is
less developed than for on-demand tests.

5. Inclusion. Three states needed to show that all
students were included in testing (including
groups such as migrant students).

6. Reporting. Three states needed to document
and clarify their reporting practices.

This descriptive study is intended to contribute
to the ongoing discussion of promising and best
practices for the assessment of students with
significant cognitive disabilities.



CHALLENGES TO DESIGNING AND
IMPLEMENTING ALTERNATE ASSESSMENTS

The No Child Left Behind Act and the Individuals
with Disabilities Education Act of 2004 represent
a paradigm shift in instructing and assessing
students with significant cognitive disabilities.
According to one study, these acts have moved special
education from “a culture of compliance to a culture
of accountability for results” (Manasevit &
Maginnis, 2005, p. 51). Yet they have also confronted
states with technical and logistical challenges far
greater than those created by assessments designed
chiefly for general education students. Issues of bias,
validity, and reliability (see appendix A), as well as
approaches to training and monitoring, are
complicated by the heterogeneity and varying degrees
of disability in the targeted population and by the
nature of the performance assessments developed to
increase access for this student group.

Of the six research questions that guided this
study (see appendix D), the first (“What challenges
are states encountering when implementing
new alternate assessment policies and
practices?”) prompted researchers to review the
national literature. They found that states across
the country face five technical and logistical
challenges as they struggle to develop standards-based
alternate assessments:

• Deciding who should participate.

• Deciding what content alternate assessments
should measure.

• Defining technical adequacy for alternate
assessments.

• Creating reliable and valid alternate
assessments.

• Defining proficient performance (setting
standards).

Each of these challenges is discussed below.

Deciding who should participate

Alternate assessments are reserved for students
with the most significant cognitive disabilities,
but the phrase significant cognitive disabilities,
used extensively in the literature and in federal
guidelines, has no single authoritative definition.
The Individuals with Disabilities Education
Act of 2004, a common source of special
education terminology, does not define it. And
only one study (Almond & Bechard, 2005) has
formally addressed the characteristics of students
who take alternate assessments.

Many states have defined significant cognitive
disabilities using general language, such as “unable to
participate in the general assessment even with
accommodations.” But such definitions, which focus
on deficits in functioning, do not say much about
students as individuals (McDonnell, Hardman, &
McDonnell, 2003). To design an assessment one
needs a thorough understanding of the targeted
student populations and their relevant characteristics.
And the students now being targeted by
alternate assessments form a widely heterogeneous
group in disability characteristics, capabilities,
and education needs (Snell & Brown, 2006).



Although the states have not precisely identified the 
characteristics of students who should participate in 
alternate assessments, many states include students 
with autism, moderate to severe mental retardation, 
multiple disabilities, and traumatic brain injury — 
some of the 13 distinct disability categories defined 
by the Individuals with Disabilities Education Act
of 2004 and its corresponding Code of Federal
Regulations (34 CFR, 300.7 and 300.8). Students in 
several of these federally defined categories are con- 
sidered to have significant cognitive disabilities. But 
using these federally defined labels to determine 
eligibility for participation in alternate assessments 
can be problematic. First, the labels are operation- 
ally defined in different ways across local and state 
education agencies. Second, states have typically 
used a checklist, or series of questions, completed 
by a student’s IEP team or committee to guide them 
through the state’s process for determining eligibil- 
ity. But IEP teams vary in their ability to classify 
students properly, especially students with multiple 
disabilities. Finally, decisionmaking templates vary 
widely across states, ranging from a few general 
questions about a student’s level of functioning to 
extensive multistep procedures that require con- 
sidering the student’s curriculum and document- 
ing possible testing accommodations. Researchers 
know little about these frameworks’ validity or 
about whether different state processes yield com- 
parable decisions about participation (Almond & 
Bechard, 2005; Yovanoff & Tindal, 2007). 






The 1 percent rule makes it es- 
sential that IEP teams correctly 
identify students for alternate 
assessment participation. The 
procedures and guidelines used 
to determine student eligibility 
must be linked to the curriculum 
(Almond & Bechard, 2005). To ensure such
linkage — and to make informed decisions about 
whether to use an alternate assessment rather than 
accommodations on a large-scale standardized 
test — IEP teams must be thoroughly familiar with 
the format and content of state assessments and 
with state policies on testing accommodations. But 
Yovanoff and Tindal (2007) indicate that this is 
not always the case. Browder, Spooner, Ahlgrim- 
Delzell, et al. (2003) identify this as an area where 
training is sorely needed. 



The addition of a new student population under 
the 2 percent rule (students eligible for alternate 
assessments even though they are not identified 



as significantly cognitively disabled; see box 1) has 
further complicated the identification of students 
eligible for alternate assessments. States may choose 
to implement an additional assessment based on 
modified academic achievement standards. They 
have the flexibility to determine who may take the 
modified assessment but must follow the guidelines 
set forth by the U.S. Department of Education. 

According to the nonregulatory guidance, eligible 
students have disabilities under section 602(3) 
of the Individuals with Disabilities Education 
Act and have access to a curriculum based on 
grade-level standards. In each case objective 
evidence — including performance on state or 
other assessments that validly document aca- 
demic achievement — must show that a disability 
has precluded the student from achieving grade- 
level proficiency. The student’s IEP team must be 
reasonably certain that, even if significant growth 
occurs, the student will not achieve grade-level 
proficiency within the year covered by the indi- 
vidualized education program. The team must 
base this determination on the student’s progress 
in response to appropriate instruction (including 
special education and related services designed to 
address individual needs) and on multiple valid 
measures of student progress over time. Finally, 
the student’s individualized education program 
must include goals based on academic content 
standards for the grade in which the student is en- 
rolled (U.S. Department of Education, 2007, p. 16). 

For adequate yearly progress calculations, states 
may claim that students meeting these criteria and 
included in the modified assessments are “profi- 
cient” or “advanced” so long as the number of such 
scores does not exceed 2 percent of all students in 
the grades assessed (about 20 percent of students 
with disabilities). Like the 1 percent rule, the 2 
percent rule does not limit the number of students 
who may participate in a modified assessment. 
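The cap arithmetic described above can be illustrated with a minimal sketch. The function name, the example enrollment figures, and the choice to round the cap down are assumptions made for illustration; they are not drawn from the regulations or guidance cited here.

```python
def countable_proficient_scores(total_assessed, proficient_scores, cap_percent):
    """Number of proficient/advanced scores that may count toward AYP.

    total_assessed: all students in the grades assessed (hypothetical input).
    proficient_scores: proficient or advanced scores earned on the alternate
        or modified assessment.
    cap_percent: 1 for the 1 percent rule, 2 for the 2 percent rule.
    """
    if total_assessed < 0 or proficient_scores < 0:
        raise ValueError("counts must be non-negative")
    # Integer arithmetic avoids floating-point surprises. Rounding the cap
    # down is an assumption, not a rule stated in the text above.
    cap = total_assessed * cap_percent // 100
    return min(proficient_scores, cap)


# Hypothetical state with 50,000 students in the assessed grades.
print(countable_proficient_scores(50_000, 1_200, 2))  # 1000: the cap binds
print(countable_proficient_scores(50_000, 400, 1))    # 400: under the 500 cap
```

Note that the cap limits only the scores counted as proficient or advanced, not the number of students who may take the assessment, which is why the function takes proficient scores rather than participants as its input.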

Though little is known about whether states are 
correctly identifying students with disabilities 
(mild to moderate or more severe), states are 
required to develop valid and reliable assessments 
for them. The Individuals with Disabilities
Education Act of 2004 instructs states to develop 
“guidelines for participation of students in alter- 
nate assessment for those children who cannot 
participate in state- and district-wide assessment 
programs” (§300.138, part B). It requires students’ 
individualized education programs to docu- 
ment the justification for their exclusion from the 
general large-scale assessment and to describe how 
they will be assessed using an alternate method. 
The challenge of deciding who should participate 
in alternate assessments needs further study, 
regionally and nationally. Across the five Southwest Region
states, definitions and parameters for
eligibility differ substantially (see tables C2 and C3 
in appendix C). The diversity of definitions across 
the country is even greater. 



standards should form the foundation for alternate 
assessment, and evidence of this link should rou- 
tinely be available (U.S. Department of Education, 
2003b, 2006b; U.S. Department of Education, Office 
of Elementary & Secondary Education, 2004). 






States are now struggling 
to identify outcomes on 
which to base alternate 
content standards and to 
find curriculum models 
that meet students’ needs 
in addressing these stan- 
dards. In addition, they are struggling to link the 
alternate standards to their grade-level counter- 
parts in accordance with expectations set by the 
No Child Left Behind Act. 



Deciding what content alternate 
assessments should measure 

The Individuals with Disabilities Education Act of 
2004 states that students with disabilities should 
have access to general education curricula and 
academic standards. Students with significant dis- 
abilities must have instruction and accommoda- 
tions that promote their progress, no matter how 
modest, toward meeting state and district aca- 
demic standards for the larger student population. 
The emphasis on common standards and curricula 
is a paradigm change from traditional curricula 
and inclusion practices: 

Although the law still maintains the right of 
each student with disabilities to an indi- 
vidually referenced curriculum, outcomes 
linked to the general education program have 
become the optimal target. It is no longer 
enough for students with disabilities to be 
present in a general education classroom 
(Pugach & Warger, 2001, p. 194). 

Policymakers and researchers increasingly agree 
that alternate assessments are intended to function 
as one component in a larger accountability system 
and to measure progress toward general education 
expectations. A state’s general education academic 



Outcomes and curriculum models. Test develop- 
ers and policymakers struggle over the content 
and focus of state alternate assessments. Should 
alternate assessments focus on “the content stan- 
dards (or core learning outcomes) identified for all 
students” or on “a separate, more ‘functional’ set of 
learner outcomes” (Kleinert & Kearns, 1999, p. 101)? 

The functional-skills curriculum model for stu- 
dents with significant cognitive disabilities was 
intended to promote community inclusiveness. It 
was a paradigm shift from previous developmental 
models based largely on infant and early childhood 
curricula. Developmental models hinged on the 
belief that many students with significant cognitive 
disabilities would not continue to develop intel- 
lectually as their typically developing peers would 
(Browder et al., 2004) and were in essence based
on students’ mental rather than chronological age. 
Early functional curriculum models, by contrast, 
focused primarily on skills for independent living, 
such as cooking, shopping, managing money, using 
public transportation, and living in the community 
(National Center on Learning Disabilities, 2007). 

Functional curricula vary from district to district, 
and often from classroom to classroom, depending 
on student needs or the mandates of an indi- 
vidualized education program. One functional 










curriculum includes personal-care skills (groom- 
ing, health, dressing, attending to medical needs), 
domestic skills (shopping, cleaning, cooking, 
budgeting, planning), recreation skills (making 
social connections, using the library, swimming, 
biking), and employment skills (prevocational, 
vocational, on-the-job training, community-based 
job experiences; Provincial Out- 
reach Program for Autism, 2007). 
A functional-skills curriculum is 
child-centered, not curriculum- 
centered. It is fluid, changing 
with the needs of the student, and 
is teacher-selected and teacher- 
directed to emphasize academic 
tasks that the student will use 
daily and can apply in real life. 



The shift from a functional-skills assessment ap- 
proach to one based more on academic skills has 
advanced the trend toward alternate assessments 
across the country. The states, however, had to 
decide how to relate academic content standards to 
the alternate assessments. Would they keep stan- 
dards identical with general education standards, 
revise or amend the general education standards, 
or develop separate alternate assessment standards 
(Thurlow, 2004)? 



states are attempting to combine the functional 
curriculum with the more traditional academic 
curriculum. Because functional academic curri- 
cula vary from state to state and from classroom to 
classroom, and research on functional academics 
is limited (Browder et al., 2004), it is difficult to 
identify the best blend. The Chicago Public Schools 
offers high school classes in functional academics 
developed specifically to address the goals and ob- 
jectives of the Illinois Alternate Assessment. Most 
students in the classes have significant cognitive 
disabilities, but the classes are not limited to these 
students. Besides instruction in four main content 
areas, students take a variety of community-based 
classes, including (Chicago Public Schools, 2007): 

• Reading. This class teaches reading com- 
prehension and critical-thinking skills to 
students who can benefit from classroom 
reading instruction. Students are exposed to 
Chicago public libraries and receive weekly 
peer tutoring. 

• Math. Available to students who function at a 
basic math level necessary for further devel- 
opment, this class is geared to the individual. 
It covers such areas as time, computation, 
money usage, and basic functional math. 



A state’s response to federal alternate assessment 
requirements would reflect decisions about its 
education values and priorities. Many educators 
struggled with the expectation that they would 
teach students to read and solve mathematical 
problems when many of the students still had not 
mastered basic life skills, such as using a consis- 
tent mode for communication (speaking, gestur- 
ing, using eye gaze), lifting their heads, or groom- 
ing themselves. As states began to implement 
the mandates of the Individuals with Disabilities 
Education Act of 1997, they faced the challenge of 
negotiating between the federal requirements and 
the pervasive, often competing needs of students 
(Ysseldyke & Olsen, 1999). 

To meet federal requirements and provide students 
with disabilities access to the general curriculum, 



• Communication. This class stresses functional 
vocabulary. Students who have difficulty 
receiving or sending spoken language explore 
the use of augmentative communication sys- 
tems, such as pictures and signing, to make 
requests and share experiences. Students 
learn to identify, trace, or write sight words, 
depending on ability. Modified communica- 
tion is geared to students who require more 
intensive instruction. Students learn basic 
functional academics such as counting, num- 
ber recognition, following directions, learning 
community words and signs, and becoming 
more aware of their environment. 

• Computers. Students are introduced to 
a computer’s parts and functions and to 
various software programs. Individualized 
enrichment activities are available depending
on student needs. Adaptive equipment assists 
students with various degrees of physical in- 
volvement. And classes address various levels 
of computer literacy. 

States have not made academic content standards 
the only foundation for alternate assessments 
(Browder, 2006; Towles-Reeves, Muhomba, & 
Kleinert, in press). Functional skills remain a 
priority in instruction and have not been elimi- 
nated from alternate assessment models. But 
many states, to meet policy regulations and more 
broadly educate their students, are using academic 
tasks and contexts for alternate assessments in 
reading and math. Thus, states are likely using 
a combination of academic and functional skills 
linked to state standards (Browder, Spooner, 
Ahlgrim-Delzell, et al., 2003). Most are leaning to- 
ward linking general education content standards 
to alternate assessments. 

As states move from functional skills to more aca- 
demic skills, teachers will need to understand the 
broader policy context and to be able to implement 
new curriculum expectations. Ascertaining the 
right blend of skills can be challenging. Teach- 
ers’ abilities vary, as do student populations and 
state guidelines (Thurlow, 2004). Although some 
local education agencies and institutions of higher 
education are stepping up to support educators 
who must implement the assessments and align 
special education curricula, more data need to be 
collected in this area. 

Alignment and linkage. Several researchers have 
developed models that differentiate alignment and 
linkage. Alignment designates the degree to which 
content (such as skills and concepts) concurs in 
two sets of standards or in an assessment and a 
set of standards. Alignment relationships tend 
to be direct relationships (skill-content matches)
and are typically observed between standards and 
assessments for a single student population (such 
as general education, special education, or English 
language learners). Linkage refers to relationships 
that tend to be developmental, foundational, or 



proximal and are typically observed between 
standards or assessments developed for different 
populations (such as general education standards 
and alternate standards; WestEd, 2004). 



Researchers performing linkage studies need 
to include reviewers with content expertise and 
knowledge of the targeted student population. 
Relationships between sets of standards are typi- 
cally less direct than those found with alignment 
studies, often linking precursor or support skills 
with general education grade-level expectations. 



The Individuals with Disabilities Education Act 
of 2004 and the No Child Left Behind Act support 
the design of alternate assessments as an extension 
or modification of state assessment systems based 
on general education standards. States are devel- 
oping and refining their alternate assessments 
to focus more on these standards: “In 1992, 32 
percent of states were using only functional skills 
for their alternate assessments with no link to state 
standards[;] by 2001 only 8 percent were doing so”
(Browder, Fallin, Davis, & Karvonen, 2003, p. 259). 
Similarly, a 2005 survey of state departments of 
education indicated that 90 percent of states were 
adhering to the requirements of the No Child Left 
Behind Act and the Individuals with Disabilities 
Education Act of 2004, 
using academic content 
standards as the basis 
of their alternate assess- 
ments in linking func- 
tional skills to content 
standards (Thompson 
et al., 2005).






Aligning alternate assessments to state academic 
content standards is critical for two reasons. First, 
it is intended to provide access to the general 
curriculum for students with significant cognitive 
disabilities, setting high expectations for stu- 
dents. Both the letter and spirit of the law reflect 
this intention (U.S. Department of Education, 
2005, p. 6; Ysseldyke, Dennison & Nelson, 2004). 
Second, such assessment data are believed to lead 
to practices that improve instruction quality and 
assessment policies (McDonnell, McLaughlin &
Morison, 1997; Ysseldyke, Dennison & Nelson, 
2004). But alignment remains a complex issue for 
alternate assessments. Because of the greater local 
choice and range of artifacts in these assessments, 
the method for demonstrating alignment between 
content standards and performance assessments is 
less developed than for on-demand tests. 



researchers have called for more studies to find the 
right blend of functional and academic standards 
(including performance indicators that reflect that 
blend), to help states give students with severe dis- 
abilities access to the general curriculum, and to 
improve alignment criteria (Browder et al., 2005; 
Browder et al., 2002; Browder, Fallin, et al., 2003; 
Browder, Spooner, Ahlgrim-Delzell, et al., 2003). 






Browder, Spooner, Ahlgrim-Delzell, et al. (2003) 
question how well versed special education teach- 
ers are in linking functional skills to content 
standards, concluding that teachers need to be 
trained on how to use functional academics. For 
example, while teachers are adept at using sight 
words in instruction, they need more training 
on how to teach story patterns and tie in their 
personal experiences (Browder, 2006; D. Farley, 
Education Consultant, Special Education Bureau 
at the New Mexico Public Education Department, 
personal communication, June 27, 2007; Louisiana 
Department of Education, 2006; Towles-Reeves 
et al., in press). Towles-Reeves et al. identified 18 
studies conducted between December 2002 and 
2006 examining the readiness of special educa- 
tion teachers to implement a 
standards-based curriculum. 

Their summary provides further 
evidence of a disconnect between 
the intent to provide instruction 
based on academic content to 
students with significant cognitive 
disabilities and the practical abili- 
ties of teachers. 



Browder et al. (2005) have explored the problem 
of validating performance indicators with content 
experts and stakeholders. They found that 42 states 
developed rubrics to measure student performance, 
but all states needed help providing evidence of re- 
liability and validity from scores (see appendix A). 

Other researchers have observed a change in 
states’ curriculum philosophies, as they shift from 
functional to more academic alternate assessment 
achievement standards and performance descrip- 
tors. Because the academic focus is fairly narrow, 



Defining technical adequacy for alternate assessments 

The development or redesign of an alternate as- 
sessment will be driven by the design imperatives 
for technically sound assessments. All good assess- 
ments must be valid, reliable, and usable (Forte 
Fast, 2004; E. Forte, president, edCount, LLC, per- 
sonal communication, February, 2006; Rabinowitz 
& Sato, 2005). But models for investigating the 
technical adequacy of alternate assessments are 
scarce or nonexistent. Only a few research studies 
are available on this topic. Although criteria for 
evaluating the validity of general education tests 
have been extensively written about (American 
Educational Research Association, American 
Psychological Association, & National Council on 
Measurement in Education, 1999; Green, 1998; 
Messick, 1993; Webb, Horton, & O’Neal, 2002), Yo- 
vanoff and Tindal (2007) cite only seven published 
studies of alternate assessment validation, with 
much of the research based on surveys. 

There is no consensus on what evidence should be 
collected in alternate assessments or how much of 
it is sufficient to ensure technical adequacy. The 
most significant finding in the literature is that 
alternate assessments for students with the most 
significant cognitive disabilities typically lack 
adequate psychometric properties, with shortcom- 
ings in validity, reliability, and usability. 

Validity. Validity is the extent to which a test mea- 
sures what it was designed to measure. According 
to standard 1.1 in the Standards for educational 
and psychological testing (American Educational 
Research Association et al., 1999), validity evi- 
dence should be collected for every intended inter- 
pretation and use of the scores that a measurement 
instrument yields (U.S. Department of Education,
Office of Elementary & Secondary Education, 
2004). Alternate assessments are intended to serve 
several purposes: 

• To provide a measure of student proficiency 
to inform parents, to help teachers plan 
instruction for the following year (so that the 
evaluation of instructional programs informs 
ongoing instruction), and to comply with 
federal mandates — especially adequate yearly 
progress requirements. Browder et al. (2005), 
Snell and Brown (2006), and Towles-Reeves 
et al. (in press) indicate that results from 
alternate assessments should help teachers 
determine functioning at the time of testing 
and identify specific skills acquired, those 
requiring continued instruction, and sup- 
port needed, including assistive technologies. 
Ideally, this process will inform the student’s 
individualized education program and sup- 
port a plan for instruction for the following 
year (Towles-Reeves et al., in press). 

• To hold teachers, schools, and districts ac- 
countable for implementing standards-based 
curricula and using assessment results to 
improve student learning. The annual devel- 
opment and administration process helps to 
focus educators on the development, instruc- 
tion, and assessment of performance goals 
aligned with state performance standards 
(Towles-Reeves et al., in press). Since the in- 
ception of the No Child Left Behind Act, such 
results must be sufficiently valid to support 
adequate yearly progress decisions. 

• To inform and support program evaluation 
at the classroom, school, and district levels, 
including the identification of resources that 
may further support instruction and provide 
topics for professional development
(Towles-Reeves et al., in press).

Validity evidence traditionally takes several forms 
and comes from a variety of sources. But most evi- 
dence supporting the use of alternate assessments 



for the indicated purposes consists of descriptions 
of how assessments were developed and the evi- 
dence submitted to peer review (R. Quenemoen, 
Technical Assistance Team Leader, National 
Center for Educational Outcomes, personal com- 
munication, October, 2006; M. Thurlow, Direc- 
tor, National Center for Educational Outcomes, 
personal communication, 2005). Although such 
evidence is important, other evidence is needed 
(U.S. Department of Education, Office of Elemen- 
tary & Secondary Education, 2004; U.S. Depart- 
ment of Education, 2006a). Yet it has not always 
been collected. 






Alternate assessments are 
individualized, making it 
difficult to collect tradi- 
tional validity evidence. 

A score can have different 
meanings for different 
students, depending on 

their instructional goals and the characteristics 
of the assessments administered (Schafer, 2005; 
Towles-Reeves et al., in press). So, validity
evidence must focus more on consequential aspects
for alternate assessments than for traditional 
on-demand assessments — which exist primarily to 
measure what has occurred, not to influence how 
instruction should take place and be supported. 



Some forms of validity evidence adapted to alter- 
nate assessment include: 



• Intrinsic rational validity evidence. An artifact 
of the test development process, this evidence 
is intrinsic because it is built into the test, and 
rational because it is derived from rational 
inferences about the kinds of tasks that will 
best meet measurement goals (Ebel, 1983). 

In most states, including Southwest Region 
states, the evidence submitted to peer review 
on assessment development served as intrinsic 
rational validity evidence (U.S. Department of 
Education, 2006a). Some states in the South- 
west Region, especially those redesigning 
their assessments, might benefit by reviewing 
how to summarize this evidence. 
• Content-related or curricular validity evidence.

This evidence addresses how adequately 
assessment tasks are aligned to an assess- 
ment’s intended focus (material or standards). 
Several features of the annual development 
process, such as teacher training and a test 
administration manual, can provide evidence 
that assessment results measure intended 
content-standard objectives (Towles-Reeves
et al., in press). 

• Face validity. Addressing whether an assess- 
ment appears to measure what it is supposed 
to measure, this evidence is important to any 
assessment program (though no respected 
technical source considers it a substitute for 
either conceptual or data-derived validity 
evidence). Face validity can help teachers, 
parents, and community members accept the 
results of an assessment. If they do not see it 
as relevant or understand its purpose, they are 
less likely to give it their attention and support. 
A test’s face validity is typically gauged by how 
stakeholders respond to the use of test results to 
inform instruction and monitor accountability. 
One can learn this by periodically surveying 
parents, teachers, and other groups of interest. 
Some Southwest Region states took another 
approach: reviewers made a frequency count of 
the skills that teachers selected on portfolios or 
performance tasks and computed correlations 
(National Alternate Assessment Center, 2005; R. 
Quenemoen, Technical Assistance Team Leader, 
National Center for Educational Outcomes, 
personal communication, October, 2006). 






• Consequential validity evidence. A test’s
appropriateness to a set of assessment goals is
determined by evaluating the intended and 
unintended consequences of the assessment 
process and results (Messick, 1993). This is 

especially important for alternate 
assessments, where the develop- 
ment and administration processes 
can be unusually complex and 
labor intensive. Many states are 
developing monitoring protocols 



and teacher professional-development sup- 
port materials to ensure that the alternate 
assessment process examines how classroom 
practices change, both in administering the 
assessment and in incorporating the results 
into instruction. Research on training teachers 
to understand alternate assessment’s positive 
aspects and to use results in the classroom is a 
part of consequential validity. States are begin- 
ning to analyze and collect evidence for this 
process (Browder, Spooner, Ahlgrim-Delzell, et al.,
2003; Browder, Fallin, et al., 2003; D. Farley,
Education Consultant, Special Education
Bureau at the New Mexico Public Education 
Department, personal communication, June 
27, 2007; Towles-Reeves et al., in press). 

Reliability. Reliability quantifies the consistency 
of measurement results. Consistent measurements 
are a prerequisite to interpreting scores appropri- 
ately. For traditional large-scale assessments, with 
standardized items, administration, and scoring 
procedures, computing reliability is fairly straight- 
forward. Most state assessment programs use a 
measure of internal consistency that makes reli- 
ability above 0.8 (and often exceeding 0.9) readily 
attainable, given sufficient sample sizes (typically 
in the tens of thousands) and number of items (30 
and above; Nitko, 1996). 
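The report does not name the internal consistency measure that states use. One widely used statistic of this kind is Cronbach's alpha, and the sketch below assumes it purely for illustration. The function name and the small score matrix are hypothetical, and real programs would compute the statistic on far larger samples.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: one row per student, one column per test item.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two students")
    # Transpose rows (students) into columns (items).
    items = list(zip(*scores))
    item_variance_sum = sum(variance(col) for col in items)
    # Raises ZeroDivisionError if every student has the same total score.
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical rubric scores: 4 students, 3 items.
example = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 4, 5]]
print(round(cronbach_alpha(example), 2))  # 0.95
```

With only a handful of students or items, as is common for alternate assessments, an estimate like this is unstable, which is one reason the sample size and item count conditions mentioned above matter.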

Alternate assessments are far less standardized in 
their prompts (see appendix A) and their admin- 
istration and scoring protocols. And — because of 
the 1 percent rule, the logistical support required, 
and the perceived burden on teachers — sample 
sizes for alternate assessment are usually well 
below those for more traditional programs. Other 
reliability challenges include the limited popula- 
tion of students and teachers to support training 
and the impossibility of certain statistical data 
analyses and checks that depend on sample size, 
such as bias review procedures and discrimina- 
tion indices (Browder, Fallin, et al., 2003; Browder, 
Spooner, Algozzine, et al., 2003). 

Reliability for alternate assessments can be con- 
ceptualized in several ways. One is the consistency 
of the observed outcomes associated with a given
skill (S. Hall, Division of Special Education and 
Early Intervention Services, Maryland State 
Department of Education, personal communica- 
tion, May 22, 2006; National Alternate Assess- 
ment Center, 2005). If a student has mastered a 
skill, proficiency should be evident over multiple 
settings and occasions. Inconsistencies suggest 
that mastery interpretations cannot be generalized 
beyond the conditions of the original assessment 
task or that the student’s performance was scored 
incorrectly (Browder, 2006; Towles-Reeves et al.,
in press). 

Consistent use of a specified scoring process 
is important to reliability. The procedures and 
materials for training scorers must be standard- 
ized, not only for a given year but also across 
administrations whenever possible. To eliminate 
ambiguity the scoring process and rules must be 
documented. Validity and scoring reports must 
be reviewed daily so that scoring supervisors can 
identify scorers who are starting to drift. Finally, 
there must be scoring agreement. 
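Scoring agreement can be quantified in several ways. A minimal sketch, assuming the simplest one (the share of student entries on which two scorers assign the same rubric score), is shown here. The function name and the sample scores are hypothetical, and the report does not specify which agreement statistic states should use.

```python
def exact_agreement_rate(scorer_a, scorer_b):
    """Share of entries on which two scorers gave identical scores."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return matches / len(scorer_a)


# Hypothetical rubric scores from two scorers on the same four portfolios.
print(exact_agreement_rate([1, 2, 3, 4], [1, 2, 3, 3]))  # 0.75
```

Tracking a rate like this for each scorer over time is one way a supervisor might notice the drift described above.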

Usability. Unlike for validity or reliability, there 
are no general guidelines or statistical indices to 
determine the usability of a test or assessment 
program (S. Hall, personal communication, May 
22, 2006; Schafer, 2005). Many variables influence 
decisions about usability. A hotly debated question 
about the use of assessments for accountability is 
how useful test results are for teaching and learn- 
ing. When students as a whole do poorly on a test, 
either the test is a poor measure of their learning 
or it accurately reflects the fact that they did not 
learn. Whether it is a poor measure, and thus not 
usable for instructional decisionmaking, depends 
primarily on alignment — that is, whether the 
test is a good (reliable and valid) measure of the 
curriculum or standards to be mastered. If a test 
is aligned with the curriculum, teachers can use 
results to evaluate learning and instruction. 

A key premise underlying the use of alternate 
performance-based assessments for students 
with significant cognitive disabilities is that such 






assessments can be 
curriculum-embedded 
(administered as part 
of regular classroom 
activities) and directly 
shape future instruction. Such consequential 
validity evidence is rarely collected. Another key 
usability issue is how the assessment results are 
communicated. Stating results in terms that most 
consumers — especially teachers — can understand 
helps teachers teach and helps parents and stu- 
dents understand student performance. 



Alternate assessments are useful when they repre- 
sent what students have been taught and when they 
yield consistent, accurate scores. If these conditions 
are met, states can make confident inferences about 
classroom performance from test scores. When 
academic standards influence classroom instruc- 
tion (as some state content standards have), it is 
reasonable to use test scores in a content area as 
evidence for how much students have acquired the 
knowledge and skills specified by the standards. 



To build the technical adequacy of alternate as- 
sessments, Kleinert, Browder, and Towles-Reeves 
(2005) are applying and generalizing the cognitive 
psychology-based assessment triangle (Pellegrino, 
Chudowsky, & Glaser, 2001)— originally developed 
for general education students — to students with 
significant cognitive disabilities (see appendix A). 
Although this framework holds much promise, 
more research is needed to understand its full 
applicability (E. Towles-Reeves, Research Coor- 
dinator, National Alternate Assessment Center, 
personal communication, May 2, 2006). 



Creating reliable and valid alternate assessments 

Because traditional paper-and-pencil tests are in- 
appropriate for students with significant cognitive 
disabilities, states have had to consider alternative 
approaches and to build more valid instruments. 
Significantly cognitively disabled students tend 
to have limited communication skills — some 
being nonverbal — and extremely low academic 
achievement levels. They need highly specialized
instruction and support, such as augmented
communication systems (Almond & Bechard, 2005).
These needs are often complicated by English 
language learner or low socioeconomic status. The 
broad heterogeneity of this population requires a 
broad and flexible assessment approach. 

Although researchers have made progress deter- 
mining the technical requirements of alternate 
assessments (see, for example, Rabinowitz & Sato, 
2005), their adequacy continues to lag behind 
that of their general education counterparts — 
primarily assessments with multiple-choice and 
short and extended constructed response ques- 
tions. Given the range of student needs, one size of 
alternate assessment will not fit all. 

Three alternate assessment approaches are seen as 
most promising (for a fuller description of each, 
see Yovanoff & Tindal, 2007, p. 185; Roeber, 2002; 
and Towles-Reeves et al., in press):

• Checklists/rating scales based on observations.
Teachers are asked to rate whether students
can perform selected behaviors. Scoring is
based on the number of skills the student can
perform successfully.

• Portfolios/bodies of evidence. Teachers systematically
collect student work samples (see
appendix A), which are evaluated or judged
against predetermined scoring criteria. Some
states select the standards to be assessed. Others
allow teachers on IEP teams to select what
is assessed.

• Performance assessments (performance
events). A series of tasks is administered
and scored in terms of exemplars of proficient
performance.

Researchers believe that these methods can be tai- 
lored to the needs of students with significant cog- 
nitive disabilities and provide substantively more 
access than traditional multiple-choice assess- 
ments. They have been undergoing psychometric 
evaluation for some time (Bennet, 1993; Messick, 
1996; Traub & Fisher, 1977; Thissen, Wainer, & 
Wang, 1994, as cited in Yovanoff & Tindal, 2007).

Table 1 summarizes the assessment approaches 
commonly used nationally and how the prevalence 
of each approach has changed since the No Child 
Left Behind Act took effect. 

TABLE 1

Alternate assessment approaches commonly used by states (number and percent of states), 1999-2005

        Portfolio, performance,
        or body of evidence       Rating scale or           Individualized education                            Assessments in
        assessments               checklist assessments     program analysis          Other assessments         development or revision
Year    Number   Percent          Number   Percent          Number   Percent          Number   Percent          Number   Percent
1999    28       56               4        8                5        10               6        12               7        14
2001    24       48               9        18               3        6                12       24               2        4
2003    23       46               15       30               4        8                5        10               3        6
2005 a  25       50 b             7        14 c             2        4                7        14               8        16

a. One state had not developed any statewide alternate assessment approaches.

b. Of these 25 states, 13 use a standardized set of performances, events, tasks, or skills.

c. Of these seven states, three require the submission of student work.

Source: Thompson et al., 2005.

The National Center on Educational Outcomes
reported that in 2005 nearly half the states
across the country were using either portfolio or
performance-based assessments (Thompson et al.,
2005). Performance and portfolio assessments are
appealing because they can provide rich descriptions
of students’ real-life knowledge and skills
(Elliott & Fuchs, 1997). But Browder, Spooner, 
Algozzine, et al. (2003) have expressed concerns 
with performance-based alternate assessments, 
suggesting that their technical characteristics and 
limitations might have led to suspect outcome 
scores for students and schools. In addition, the 
same study argues (based on initial data from 
Kentucky’s efforts) that state portfolio-based 
alternate assessments might face challenges to the 
reliability of their scores. Finally, the study has 
questioned how well versed special educators are 
in linking functional skills to content standards. 

Its authors assert that much training is needed 
in this area — both to improve instruction and 
to meet federal mandates (Browder, Spooner, 
Algozzine, et al., 2003).

Because portfolios are work samples collected 
over time, many states employ a rubric to score 
portfolio-based alternate assessments. Some states 
permit the students’ teachers to score the work, 
for efficiency and in the belief that knowledge of 
the student and the context in which the work was 
collected is essential for valid scoring. Other states 
maintain a more rigorous process using more 
than one teacher to score student work to protect 
against bias and to measure interrater agreement. 
Although external scoring might seem advanta- 
geous, the administrative process and the artifacts 
for this student population make off-site scoring 
challenging. In either approach, allowing for per- 
sonal judgment jeopardizes the scoring’s reliability 
and validity (Kleinert, Farmer-Kearns, & Kennedy, 
1997; North Central Regional Education Labora- 
tory, 2007; Thompson & Thurlow, 2003). 
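
Interrater agreement of the kind mentioned above is often summarized with the exact-agreement rate and with Cohen's kappa, which discounts the agreement two scorers would reach by chance. The sketch below is illustrative only: the source does not say which statistic any state uses, and the rubric scores for the two teachers are invented.

```python
from collections import Counter

def exact_agreement(ratings_1, ratings_2):
    """Share of portfolios on which both scorers gave the same rubric score."""
    return sum(a == b for a, b in zip(ratings_1, ratings_2)) / len(ratings_1)

def cohens_kappa(ratings_1, ratings_2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(ratings_1)
    p_observed = exact_agreement(ratings_1, ratings_2)
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    # Chance agreement: product of each scorer's marginal proportions per category.
    categories = set(counts_1) | set(counts_2)
    p_chance = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both scorers used a single identical category throughout
    return (p_observed - p_chance) / (1.0 - p_chance)

# Invented rubric scores (1-4) from two teachers for the same ten portfolios.
teacher_1 = [1, 2, 3, 3, 2, 1, 4, 4, 2, 3]
teacher_2 = [1, 2, 3, 2, 2, 1, 4, 3, 2, 3]

print(round(exact_agreement(teacher_1, teacher_2), 2))
print(round(cohens_kappa(teacher_1, teacher_2), 2))
```

Kappa comes out lower than raw agreement because some matches would occur even if the two teachers scored at random from their own score distributions.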

The reliability of ratings was also challenged in 
states attempting to use portfolios and perfor- 
mance assessments as part of their general large- 
scale assessments, such as Arizona and Vermont. 
Indeed, these states were prevented from publicly 
reporting their assessment results (Koretz, McCaffrey,
Klein, Bell, & Stecher, 1993; Tindal et al.,
2003). Performance assessments may need to include
numerous tasks and work samples if they are
to demonstrate adequate coverage of state standards
and provide generalizable results. Because
such processes are extensive and time-consuming,
they are contraindicated for this population, in
which students must be assessed on a one-on-one
basis, tend to tire easily, and lack long
attention spans (Tindal et al., 2003; Browder, Fallin,
et al., 2003; Almond & Bechard, 2005).






Demonstrating reliability in performance-based 
alternate assessments can be challenging. The 
primary determinant of reliability is the number 
of items or tasks on a test. Alternate assessments 
typically consist of a few larger tasks, rather than 
40-60 discrete multiple-choice items. Because reliability
depends partly on the number of discrete
tasks that make up an assessment, it is possible to 
increase an alternate assessment’s reliability by 
breaking larger tasks down into smaller subtasks 
for analysis. Still, the wide range of skills and tasks 
targeted by performance-based alternate assess- 
ments creates further challenges for comparabil- 
ity and for determinations of across-the-board 
technical adequacy. Browder, Spooner, Algozzine, 
et al. (2003) also identify student risk factors (such 
as unstable student behavior or health status) as 
possible influences on alternate assessment results. 
For on-demand performance tasks, fluctuations in 
student behavior or physical well-being could yield 
invalid results. 
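
The link between the number of tasks and reliability described above is commonly illustrated with the Spearman-Brown formula from classical test theory. The source does not name this formula, so treat the sketch as an assumption-laden illustration: it supposes that the added subtasks are parallel to the existing ones, which may not hold when subtasks are carved out of a single larger task. The starting reliability of 0.60 is made up.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`.

    Assumes the added parts are parallel to the existing ones (equal quality,
    measuring the same construct), which is a strong assumption for subtasks.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)

# Hypothetical case: a few large tasks yield reliability 0.60; each task is
# then split into 2, 3, or 4 separately scored subtasks.
base_reliability = 0.60
for factor in (1, 2, 3, 4):
    print(factor, round(spearman_brown(base_reliability, factor), 3))
```

The gains shrink as the factor grows, so subdividing tasks helps most when the original assessment has very few scorable units.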



Performance-based alternate assessments typi- 
cally employ some local “human” scoring. States 
can address this additional source of unreliability 
by carefully training scorers, monitoring imple- 
mentation, and moderating locally derived scores. 
Allowing teachers a choice of performance tasks 
raises additional validity and reliability questions. 
Several states are using a combination of methods 
(such as portfolios with a checklist). This multi- 
method approach is designed both to increase the 
technical quality of the results and to meet the 
diverse access needs of a heterogeneous student 
population (see appendix B for demographic
information). All approaches need further study
and articulation, both individually and in com- 
bination (Thompson et al., 2005; Towles-Reeves 
et al., in press). 



Defining proficient performance (setting standards) 



In December 2003 the U.S. Department of Educa- 
tion proposed a change in policy for the No Child 
Left Behind Act and students with significant dis- 
abilities. It recommended that states be permitted 
“to define alternative achievement standards for 
students with the most significant cognitive dis- 
abilities. Such students will take an alternate as- 
sessment. These alternate achievement standards 
must be aligned with the state’s academic content 
standards and reflect professional judgment of 
the highest learning standards possible for those 
students” (U.S. Department of Education, 2003b). 
States now had to decide what content to include 
on alternate assessments and how to define con- 
tent mastery. Several states — including all five in 
the Southwest Region— defined mastery by setting 
standards for their alternate assessments (Lewis, 
Mitzel, & Green, 1996; Roach & Elliott, 2004; see 
table C7 in appendix C).






To set alternate achievement standards is to es- 
tablish cutscores corresponding to the knowledge, 
skills, and competencies that constitute profi- 
ciency at each level of performance. As important 
as the cutscores themselves is a performance 
descriptor (see appendix A) that indicates typical 
student knowledge, skills, and abilities for a given 
score. The descriptor helps the teacher identify 
skills and abilities that a given student cannot yet 
perform consistently, communicate with others 
about the student’s progress, and determine next 
year’s instructional goals. Finally, it indicates the
student’s status relative to state
learning standards (Roach & Elliott, 2004).



The most common standard setting method for general
assessment, the bookmark method
(Lewis et al., 1996), is better
suited to assessments with multiple items. (For
more information on the bookmark method, 
see Kiplinger, 1997 and Olsen, Mead, & Payne, 
2002.) More and more, states are using the body 
of knowledge method for alternate assessments, 
since it is better suited to assessments that include 
direct student performance (such as work samples; 
S. Bechard, Director, Office of Inclusive Educa- 
tional Assessment at Measured Progress, personal 
communication, December 2006). 

Using the bookmark method, reviewers must make fine distinctions
between performance levels on adjacent
items with slightly different item response theory 
difficulty values (Hambleton & Swaminathan, 
1985; Tindal et al., 2003). Item response theory is 
the study of test and item scores based on assump- 
tions concerning the mathematical relationship 
between abilities (or other hypothesized traits) 
and item responses (Baker, 2001). When assessments
are more task-driven (Tindal et al., 2003),
the larger grain size makes item-by-item comparisons
more difficult. The body of knowledge method
takes a more holistic view of performance, requiring
standard setters to review and differentiate
between profiles of performance rather than 
individual items. 
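
To see why bookmark reviewers face fine distinctions, it can help to look at a small numerical sketch. The code below assumes a one-parameter (Rasch) item response model and a response-probability criterion of 2/3. Both are common choices but are assumptions here, since the source does not say which model or criterion any state used. The item difficulties are invented.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def bookmark_cut(difficulty, response_probability=2 / 3):
    """Ability at which the bookmarked item is answered correctly with the
    chosen response probability (an assumed criterion, here 2/3)."""
    return difficulty + math.log(response_probability / (1.0 - response_probability))

# Hypothetical ordered item booklet: difficulties sorted from easiest to hardest.
ordered_difficulties = [-1.20, -0.40, 0.10, 0.15, 0.90]

for b in ordered_difficulties:
    print(b, round(bookmark_cut(b), 3))

# Placing the bookmark on the third versus the fourth item moves the cut by
# only the difference in their difficulties (0.05 on the ability scale).
print(round(bookmark_cut(0.15) - bookmark_cut(0.10), 3))
```

With only a handful of large tasks there are few such ordered points to choose among, which is one way to read the preference for a holistic method described above.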

Several issues specific to setting performance 
standards on alternate assessments for students 
with the most significant cognitive disabilities 
remain unsettled (Almond & Bechard, 2005; 
Thompson et al., 2005). First, given the diversity 
of this population, applying one standard for all 
is problematic — especially since many, because 
of their disabilities alone, may never be able to 
meet the state-determined standard. Equally 
challenging is setting standards on performance 
assessments, given the range of student evidence 
obtained and the ability of teachers to develop and 
administer tasks of their own. 

The relatively small number of students in some 
states makes obtaining sufficient datasets for com- 
plex standard setting procedures difficult. Finally, 
a lack of clarity about the targeted population and 
content standards often yields vague performance
descriptors. Three of the five states in this study
received negative comments related to their setting 
of standards from a national peer review process 
to determine whether state standards meet No 
Child Left Behind requirements (U.S. Department 
of Education, 2006a). 



A REVIEW OF ALTERNATE ASSESSMENTS 
ACROSS THE SOUTHWEST REGION STATES 

This section focuses on alternate assessments 
across the Southwest Region states. It addresses 
the last five of the six research questions (2-6) that 
guided this study, namely: 

2. What do alternate assessments across the 
Southwest Region states look like? 

3. What training or professional develop- 
ment is provided for educators on alternate 
assessments? 

4. How are results collected and used at the 
state, district, school, and student levels? 

5. To what extent do states’ alternate assess- 
ments capture the same or similar skills as 
state tests designed for the general student 
population? 

6. What technical issues are states facing in de- 
veloping and implementing reliable and valid 
alternate assessments? 

In asking these questions about each South- 
west Region state, the authors found that all five 
states were making efforts to develop policies 
and practices to include students with signifi- 
cant cognitive disabilities— though all five states 
still faced significant challenges. The Southwest 
Region states report including more students with 
significant cognitive disabilities in state assess- 
ments, increasing student exposure to the general 
curriculum, and reforming special education 
curricula and instruction (see appendix C). Yet the 
five states differ in their definitions of significant
cognitive disability, their guidelines for student
participation, and the technical quality of their
assessments.

Arkansas and Oklahoma, which have achieved full 
approval status from federal peer review, report 
that they are continuing to improve programming 
and instruction by refining content achievement 
standards to reflect a blend of functional and 
academic skills linked to content standards. Loui- 
siana has changed from the Louisiana Educational 
Assessment Program Alternate Assessment (LAA), 
a checklist, to a performance-based assessment 
(LAA 1). It considered moving to portfolio as- 
sessment but opted for a traditional test (Jeanne 
Johnson, Education Consultant, Louisiana Depart- 
ment of Education, personal communication, 
January 30, 2008). New Mexico has switched from 
a checklist based wholly on functional achieve- 
ment to performance-based tasks that are linked 
to alternate achievement standards. Texas has 
created alternate achievement standards based on 
state general content standards and is transition- 
ing from locally selected alternate assessments and 
an optional state-developed alternate assessment 
to a uniform state-developed portfolio system. 



What do alternate assessments across the 
Southwest Region states look like? 



Demographic context. To facilitate meaningful 
comparisons table 2 summarizes state demo- 
graphic information for the Southwest Region 
states. (For more specific information about state 
characteristics, see appendix B.) 



TABLE 2

Summary of state demographic statistics for the Southwest Region states

Statistic                                                           Arkansas    Louisiana   New Mexico  Oklahoma    Texas
K-12 enrollment                                                     463,115     724,281     326,102     629,426     4,405,215
Public schools (number)                                             1,158       1,541       842         1,747       8,746
Local school districts (number)                                     252         68          89          540         1,227
Students receiving free or reduced-price lunch (percent)            52          62          58          54          48
Students receiving special education or related services (percent)  12          14          20          15          12
Students receiving English language learner services (percent)      4           2           19          7           16

Source: State report cards for Arkansas (http://normessasweb.uark.edu/reportcards/state05.php), Louisiana (http://www.doe.state.la.us/lde/pair/StateReport0405/10-Student_Achievement.pdf), New Mexico (http://www.ped.state.nm.us/div/acc.assess/accountability/dlRptCard2005/NMStateReportCard%20English.pdf), Oklahoma (http://title3.sde.state.ok.us/studentassessment/2005results/reportcard2005state.pdf), and Texas (http://www.tea.state.tx.us/research/pdfs/2005_comp_annual.pdf).

Texas has by far the largest enrollment, with six
times as many students enrolled in K-12 as the
state with the second-largest enrollment (Louisiana)
and significantly more public schools and
public school districts (local education agencies)
than the other four states. While it takes similar
efforts to design, plan, and devise assessment
systems, differences will
arise in resources, cost, and logistics when materi- 
als have to be distributed, collected, retrieved, 
and scored for larger numbers of students. For 
instance, although development of the assessment 
itself is a roughly equivalent task in each state, 
state scoring of portfolios or performance events 
for 40,000 students in Texas requires vastly greater 
resources and planning than for 2,200 students 
in New Mexico — an issue that influenced Texas’s 
decision to have teachers score tests and submit 
scores themselves. In New Mexico, by contrast, the 
state will score the results for 2007/08 (D. Farley, 
Education Consultant, Special Education Bureau 
at the New Mexico Public Education Depart- 
ment, personal communication, June 27, 2007; C. 
Wieland, Director, Special Education Assessments, 
Texas Education Agency, personal communica- 
tion, June 2007). 

Special education students can be expected to be 
overrepresented in the group of students receiving 
free or reduced-price lunches (Individuals with 
Disabilities Education Act regulations, 34 CFR 
300.8). National literature and state reports show 
that low socioeconomic levels (as represented by 
this indicator) affect academic performance for 
both general education students and their special 
education counterparts. 

For K-12 students in the five Southwest Region
states, race and ethnicity are another important
factor affecting assessments (see appendix B,
table B1). In Louisiana, New Mexico, and Texas
non-White students are a majority. In New Mexico
Hispanic students are the majority (53 percent).
Texas reports a 45 percent Hispanic population.
Although not all Hispanic students speak Spanish 
as their primary language, such data underscore 
the concerns of state contacts who note that ad- 
ditional assessment and instruction issues arise 
for special education students who are English 
language learners — for example, they may require 
additional language-based accommodations such 
as access to dictionaries or translations (D. Farley, 
Education consultant, Special Education Bureau 
at the New Mexico Public Education Department, 
personal communication, June 27, 2007; C. Wie- 
land, personal communication, 2007). 

New Mexico (20 percent), Oklahoma (15 percent), 
and Louisiana (14 percent) have the highest per- 
centages of students receiving special education 
and related services, all higher than the national 
average of 10-12 percent. New Mexico has the 
highest percentage of students receiving both 
special education and English language learner 
services, with Texas next. Both states report high 
numbers of students receiving special educa- 
tion and English language learner services. Such 
numbers exacerbate the difficulties of developing 
alternate assessments and training teachers to 
implement them and interpret the results properly.
State contacts expressed a need for technical
assistance to develop better ways of assessing
non-English-speaking students with significant
cognitive disabilities (D. Farley, Education Consultant,
Special Education Bureau at the New Mexico
Public Education Department, personal communication,
June 27, 2007; C. Wieland, personal
communication, June 2007).



Heterogeneity in the population of most significantly
cognitively disabled students intensifies
the technical and logistical challenges for alternate
assessment, as states attempt to create and
implement valid, reliable, and bias-free tests for a
population with large numbers of poor students,
minorities, or both.

Population definition. States are permitted to
differ in their definitions, policies, and practices
related to alternate assessments. This study found
that the five Southwest Region states used several
frameworks to identify students for participation,
with eligibility parameters differing substantially
across the states.

Each state defined significantly cognitively disabled
differently, within the constraints of its own statutes
and administrative codes as well as federal
guidance (see table C3 in appendix C). And each
state operationalized its definition differently (see
table C2). Variability in screening criteria has contributed
to state-to-state variations in state assessments
and to varying and perhaps unclear student
characteristics (Thurlow, 2004). The screening
criteria are broad, general, and sometimes ambiguous.
They are specific to the group of students
with significant cognitive disabilities, but not to
subgroups within this broadly defined group.

Arkansas had the most detailed definition, close
to the American Association of Mental Retardation’s
description (2004). The Arkansas definition
indicates that students in this population have
difficulty functioning cognitively across settings,
have limited academic skills, and require extensive
supports. New Mexico uses a checklist that covers
these points. Louisiana has a description that specifies
a diagnosis (moderately severe or profound
mental disability) and describes having a significant
cognitive disability as having severe mental
disabilities, being multiply disabled, or having
autism or traumatic brain injury. In Oklahoma the
IEP teams and in Texas the admission, review, and
dismissal teams define cognitive disability based
on state-specified criteria.



Numbers participating in alternate assessments.
Through the survey conducted for this study, contacts
from each state indicated that the number of
students with significant cognitive disabilities taking
alternate assessments was consistent with their
percentage of the state’s student population (table 3).

In each state the contacts asserted that the state 
guidelines and definitions were adequate and fit 
their state context. The Arkansas contact pointed to 
a passage from the federal nonregulatory guidance 
concerning the 1 percent rule: “The rule does not 
limit the number of students with the most signifi- 
cant cognitive disabilities who may take alternate 
assessments based on alternate achievement stan- 
dards when that is appropriate. It addresses only 
the inclusion of proficient and advanced scores for 
alternate assessments based on alternate achieve- 
ment standards in adequate yearly progress calcu- 
lations” (U.S. Department of Education, 2005). 
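
The distinction drawn in the guidance can be made concrete with a little arithmetic. The sketch below uses the rounded figures reported in table 3 and computes 1 percent of enrollment in the grades tested for each state. It is an illustration only: using enrollment in tested grades as the base, rounding the cap down, and the function name are assumptions, and actual adequate yearly progress calculations follow federal and state rules not reproduced here.

```python
# (students enrolled in grades tested, students taking the alternate assessment),
# as reported in table 3 for 2005/06; the participation counts are rounded.
GRADES_TESTED = {
    "Arkansas": (246_399, 3_700),
    "Louisiana": (346_134, 4_800),
    "New Mexico": (204_488, 2_200),
    "Oklahoma": (400_692, 3_000),
    "Texas": (2_724_679, 40_000),
}

def proficient_score_cap(enrolled, cap_percent=1):
    """Largest number of proficient or advanced alternate-assessment scores
    that could count under a cap of `cap_percent` (rounded down; assumption)."""
    return enrolled * cap_percent // 100

caps = {state: proficient_score_cap(enrolled)
        for state, (enrolled, _) in GRADES_TESTED.items()}

for state, (enrolled, takers) in GRADES_TESTED.items():
    # Participation may exceed the cap; the cap limits only how many
    # proficient and advanced scores are included in the calculations.
    print(f"{state}: cap {caps[state]:,}, alternate assessment takers {takers:,}")
```

In four of the five states the number of participants is larger than this figure, which is permissible because the rule constrains counted scores rather than participation.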



Texas has 40,000 students taking alternate assess- 
ments (C. Wieland, personal communication, June, 
2007). The other states had far fewer students, but 
their numbers are consistent with the respective 
size of the states. Available state reports and inter- 
views indicate that students currently participating 
in alternate assessment typically have a special 
education label of autism, mental retardation, or 
multiple disabilities. This finding is consistent 
with research by Almond and Bechard (2005) and 
the National Alternate 
Assessment Center (2005). 



TABLE 3

Students taking alternate assessments in Southwest Region states, by grade, 2005/06

                                                                      Arkansas    Louisiana   New Mexico  Oklahoma    Texas
Total enrollment K-12                                                 463,115     724,281     326,102     629,426     4,405,215
Alternate assessment grades tested                                    3-8, 11     3-8, 10     3-10        3-8, 10-12  3-10
Students enrolled in grades tested                                    246,399     346,134     204,488     400,692     2,724,679
Students taking alternate assessment in grades tested                 3,700       4,800       2,200       3,000       40,000
Percent of all students in grades tested taking alternate assessment  1.5         1.4         1.1         0.8         1.5
Percent of all students in grades K-12 taking alternate assessment    0.8         0.7         0.7         0.5         0.9

Source: For enrollments, U.S. Department of Education, National Center for Education Statistics, 2007 (http://nces.ed.gov/ccd/); for Arkansas alternate assessment numbers, C. Marvel, Math and Assessment Specialist, Arkansas Department of Education, June 2007 and Tom Hicks, Special Projects, Arkansas Department of Education, June 2007; for Louisiana alternate assessment numbers, J. Johnson, Education Consultant, Louisiana Department of Education, June 2007; for New Mexico alternate assessment numbers, D. Farley, Education Consultant, Special Education Bureau at the New Mexico Public Education Department, June 2007; for Oklahoma alternate assessment numbers, A. Daugherty, Coordinator, Compliance Activities and Assessment Special Education Department, Oklahoma Department of Education, June 2007; for Texas alternate assessment numbers, C. Wieland, Director, Special Education Assessments, Texas Education Agency, survey comment, 2007. For enrollments, state report cards for Arkansas (http://normessasweb.uark.edu/reportcards/state05.php), Louisiana (http://www.doe.state.la.us/lde/pair/StateReport0405/10-Student_Achievement.pdf), New Mexico (http://www.ped.state.nm.us/div/acc.assess/accountability/dlRptCard2005/NMStateReportCard%20English.pdf), Oklahoma (http://title3.sde.state.ok.us/studentassessment/2005results/reportcard2005state.pdf), and Texas (http://www.tea.state.tx.us/research/pdfs/2005_comp_annual.pdf).

Purpose of testing. The five Southwest Region
states have laws, statutes, regulations, codes, board
policies, and other public
documents describing the purposes of alternate
assessments for students with significant cogni- 
tive disabilities (see table C2 in appendix C). All 
five states share the intent of the No Child Left 
Behind Act and the Individuals with Disabilities 
Education Act of 2004 to meet federal and state 
requirements and to provide accountability data 
for programs; to provide instructional information 
to teachers, parents, and students; and to improve 
student outcomes. 

All five Southwest Region states have created 
definitions and guidelines for participation in 
alternate assessments, as required by the Indi- 
viduals with Disabilities Education Act of 2004 
(see tables C3 and C4 in appendix C). To meet the 
requirements of this act — and those of the No 
Child Left Behind Act — all five states have passed 
laws, regulations, rules, state board policies, or 
administrative code (nomenclature varies by 
state; see table C2 in appendix C). All five states 
describe their alternate assessments as intended 
for students who are unable to participate in 
state and district assessments, even with accom- 
modations. This description matches the acts’ 
requirements. 



Where states vary is in their working definition 
of significantly cognitively disabled, their criteria 
for participation in alternate assessments, and 
the status of the peer review process — Arkansas 
and Oklahoma have received ratings of full 
approval, New Mexico has received approval 
expected, and Louisiana and Texas have received 
approval pending. States are addressing the 
second and third purposes, to improve programs 
and instruction, by shifting from a solely func- 
tional curriculum to a functional curriculum 
blended with academics and linked to academic 
content standards. The result: a notable shift of 
curricular philosophy, as teachers report spend- 
ing more time on more academically based 
instruction because of alternate assessment 
(Browder et al., 2005). 

Alternate assessment approach. When the Indi- 
viduals with Disabilities Education Act of 1997 
was passed, states moved quickly to develop 
alternate assessments. At that time Louisiana and 
New Mexico used checklists to assess students. 
Texas used locally selected alternate assessments 
and also produced the State-Developed Alternate 
Achievement I, a multiple-choice assessment.
Arkansas and Oklahoma were using a body of
evidence (portfolio) alternate assessment.

TABLE 4

Southwest Region state alternate assessment approaches for 2007/08

                Arkansas               Louisiana              New Mexico             Oklahoma               Texas
Approach        Portfolio              Performance            Performance            Portfolio              Portfolio
Program title   Arkansas Alternate     LEAP Alternate         New Mexico Alternate   Oklahoma Alternate     Texas Assessment of Knowledge
                Portfolio Assessment   Assessment Program     Performance Assessment Assessment Program     and Skills-Alternate
                System (AAPAS)         (LAA 1)                (NMAPA)                (OAAP)                 (TAKS-ALT)
Grades          3-8, 9 (for math), 11  3-8, 10                3-10                   3-8, 10-12             3-10
Subject         English language       English language       English language       Reading, writing,      Reading (3-9), math (3-10),
                arts, math, science    arts, math, science,   arts, math, science,   math, science,         writing (4 and 7),
                                       social studies         writing                social studies         English language arts (10)

Source: Arkansas Department of Education web sites (http://www.arkansased.org/students/assessment.html; arkedu.state.ar.us/actaap/index.htm; http://www.arkedu.state.ar.us/actaap/student_assessment/student_assessment_p1.htm) and survey and interviews with department contacts (C. Marvel and T. Hicks); Louisiana Department of Education web sites (http://www.doe.state.la.us/lde/saa/2273.html; http://www.doe.state.la.us/lde/accountability/home.html; www.doe.state.la.us/lde/saa/2343.html) and survey and interviews with department contact (J. Johnson); New Mexico Public Education Department web sites (http://legis.state.nm.us; http://www.ped.state.nm.us/div/acc.assess/assess/info.update.corner.html; http://www.ped.state.nm.us/div/acc.assess/assess/index.html) and survey and interviews with department contact (D. Farley); Oklahoma State Department of Education web sites (http://www.sde.state.ok.us/home/defaultie.html; http://www.lsb.state.ok.us; title3.sde.state.ok.us/studentassessment) and survey and interviews with department contact (A. Daugherty); Texas Education Agency web sites (http://www.tea.state.tx.us/student.assessment; http://www.tea.state.tx.us/curriculum.html; http://www.legis.state.tx.us; http://www.sos.state.tx.us/tac/index.shtml; http://www.tea.state.tx.us/student.assessment/resources/taksalt/index.html) and survey and interviews with agency contact (C. Wieland).

All five states in the Southwest Region reported that
they were planning to use portfolios or performance
tasks in the 2007/08 school year (see tables C1, C5,
and C8 in appendix C). Arkansas, Oklahoma, and
Texas were to use portfolios, and Louisiana and
New Mexico were to use performance tasks (table
4). In addition, the states are focusing their efforts
on standards-based assessment in reading, math,
and science. Louisiana and Oklahoma also assess
social studies. New Mexico and Texas include
writing (see table C1 in appendix C).

The Southwest Region states differ greatly in 
selecting tasks to be administered as part of the 
alternate assessment (see table C5 in appendix 
C). Arkansas and New Mexico predetermine 
standards and tasks to be assessed in all content 
areas. In Oklahoma and Texas the state predeter- 
mines two academic standards to be assessed and 
teachers choose three additional standards to be 
measured. In Louisiana teachers may determine
what is assessed based on targeted indicators,
and required state tasks are combined with those 
developed locally. It should be remembered that 
validity and reliability questions arise when teach- 
ers are allowed a choice of performance tasks, as 
they are in some Southwest Region states (see table 
C7 in appendix C). 

Each state reports clear evidence of a shift in state 
alternate achievement standards and performance 
descriptors — from the functional to the academi- 
cally focused. But the academic focus remains 
narrow. The five states have indicated that they
need technical assistance to define the right blend
of functional and academic standards and
instruction (including performance descriptors
that reflect the blend) and to train teachers.



What training or professional development is 
provided for teachers on alternate assessments? 

State contacts reported on what the five states are 
doing to train teachers and on each state’s per- 
ceived needs in this area. 
Current training efforts. Each state has its own
procedures for training teachers to administer and
interpret the results of alternate assessments (see 
table C6 in appendix C). Arkansas uses a regional 
train-the-trainer system. State and regional staff 
provide training to district staff, who then train 
teachers in their districts and schools how to 
administer alternate assessments, how to teach 
students with cognitive disabilities, and how to 
interpret score reports. Texas also uses a regional 
train-the-trainer model. State education agency 
staff and a contractor train the trainers in 20 
regional centers. Participants are trained on par- 
ticipation guidelines, the grade-level curriculum, 
how to record data, how to score, and how to use 
the online system. The trainers then train local 
education agency staff and special school staff. 
Texas offers the training modules online. Regional 
centers across Texas offer training on related top- 
ics such as assistive technology and instructional 
modules. The Texas Education Agency sponsors an 
annual assessment conference that offers educa- 
tors additional training on alternate assessment. 
Louisiana also uses a train-the-trainer model, with 
topics including how to administer the assessment 
and how to interpret and use results. 

In New Mexico Public Education Department 
staff members and professors of special educa- 
tion from the University of New Mexico provide 
an annual, intensive three-day training session 
in three modules (an overview, what is measured, 
and scoring) for alternate assessments. Teachers 
watch a video of students taking the assessment 
and then score case studies. Web-based training 
is offered for support. Teachers must pass a test 
to qualify to administer the alternate assessment. 
More than 1,200 teachers have been trained. Every 
other year teachers are trained on best practices of 
instruction. 






In Oklahoma Department of 
Education staff and a contractor 
train teachers in the state’s five 
regions to administer the as- 
sessment and interpret the score 
report. 



Training needs. The following training needs were
among those most commonly identified by state
contacts in the Southwest Region:

• Training on definitions and guidelines for
IEP teams. Because individualized education
program (IEP) teams are responsible for connecting
present levels of performance to annual
education goals and assessments, more comprehensive
training would likely improve
decisionmaking. According to state contacts,
teams need training on how to understand 
participation guidelines (including how to 
integrate present levels of performance and 
adaptive skills into participation criteria), 
how to select accommodations, and how to 
fit alternate assessments into state assess- 
ment systems. 

• Training for teachers on how to link functional 
skills and academics to alternate achievement 
standards. Some teachers lack background 
knowledge for aligning functional skills to 
the curriculum and standards. State contacts 
report a need for uniform training across 
districts. One needed skill is how to link the 
goals and objectives of an individualized 
education program to the alternate achieve- 
ment standards. Another is how to identify 
and develop functional and academic skills 
that emphasize a broader generalization of 
life skills and that are related to meaning- 
ful outcomes (such as inclusive learning, 
postsecondary experiences, vocational and 
employment opportunities, and community 
participation). 

• Training for teachers on how to access the 
general education curriculum. State contacts 
suggest a need for training in effective classroom
practices (explicit instruction, differentiated
instruction, peer-mediated instruction,
classroom management, and universal design)
and curriculum enhancements (curriculum 
modifications, curriculum accommodations, 
graphic organizers, text enhancements, and 
computer-based lesson enhancements). 
• Training for teachers and IEP teams on how
to align assessment and instruction. Because 
knowledge of assessment practices and of how 
to interpret results is uneven, training could 
improve individualized education programs 
and student outcomes. State contacts indicate 
a need for structured frameworks that link 
performance standards and indicators to the 
curriculum and that integrate assessment 
tasks into daily instruction — training teachers 
to map instructional tasks to content stan- 
dards (to assess curriculum and instruction), 
to monitor curricular breadth and depth, and 
to understand alignment models as they apply 
to students with disabilities. 

• Training for all staff on how to use data to in- 
form instruction. Many states are struggling in 
this area. State contacts suggested that provid- 
ing states with models or support (technical 
assistance or training) on how to employ data- 
driven activities would allow them to pass this 
information on to local practitioners and could 
potentially result in feedback on their instruc- 
tion. For example, school staff could receive 
training on identifying, collecting, and evaluat- 
ing the appropriate data to modify or develop 
instruction and to use data for reporting. 

State contacts also identified the technical assis- 
tance they would like. 

• Help in developing fair, bias-free assessments 
for students with significant cognitive disabili- 
ties. State contacts would like training in prac- 
tices that have been successful in other states 
and in methods to demonstrate lack of bias 
and validity for alternate assessments. 

• Help in operationalizing definitions and 
alternate assessment participation guidelines. 

State contacts suggest that a written process 
or step-by-step guide would help ensure a fair 
and uniform inclusion process. 

• Technical assistance on linkage and alignment
between standards, assessment, and
instruction. States can receive technical assistance
(for example, through dissemination
from expert sources) to build their understanding
of this linkage.

• Help in writing descriptors for various levels of 
performance. This part of the assessment pro- 
cess has proven challenging for many states. 
State contacts suggest that training (including 
discussion of content standards, performance 
levels, the test, and expectations for students) 
could be set in the context of standard setting 
and that models from other states and pro- 
grams could be provided. 

• Help in identifying, evaluating, selecting, and 
implementing best practices for alignment, 
scoring, rubrics, and standard setting. State 
contacts indicate that states need such help 
(or general training) to build their alternate 
assessments or improve current practices. 

• Help in collecting and reporting evidence for 
validity, reliability, and usability — including 
developing technical reports. States must 
demonstrate the technical adequacy of their 
assessments for a variety of legal, technical, 
and ethical reasons, including passing the No 
Child Left Behind peer review process. State 
contacts indicate that of several possible train- 
ing approaches, the most efficient might be 

to provide models from other more advanced 
programs. In addition, they suggest that states 
can be directed to web sites or organizations 
that provide technical assistance or to publica- 
tions, manuals, and other sources of needed 
guidance. 

• Guidance on using assessment data to inform 
and improve instruction. According to state 
contacts, both state and local education agen- 
cies would benefit from such guidance, which 
could include how to develop alternate assess- 
ments that best produce measurable results 
and how data drive accountability. They say 
that training or other dissemination resources 
on current accountability and school reform 
issues would help drive the development of
aligned alternate assessments. 



All five state contacts highlighted the need for
teacher training and technical assistance. Many
teachers and state department of education staff 
need to be trained on functional academics or on 
how to use the right blend of functional skills and 
academics in teaching (Almond & Bechard, 2005; 
Towles-Reeves et al., in press). 






In addition to ascertaining the right blend of func- 
tional skills and academic skills, Browder, Fallin, 
et al. (2003) emphasize the need to train teachers 
in aligning instruction with assessment and in 
using results. Improving alternate assessment 
practices also requires understanding teachers’ 
perspectives and considering how they have been 
trained for alternate assessments. Kim et al. (2006) 
found that teachers had a negative perception of 
their alternate assessment system — in part be- 
cause, by their own report, they had only a limited 
general understanding of large-scale assessment 
systems and the interface between these systems 
and academic instruction. According to Kleinert, 
Kennedy, and Kearns (1999), although teachers felt 
positively about students with disabilities being 
included in large-scale assessment systems, they 
reacted negatively to the amount of time spent 
completing and implementing the portfolio, and 
they questioned the reliability of scores. Kampher,
Horvath, Kleinert, and Kearns (2001) posit that
teachers who felt time constraints completing and
implementing the portfolio reached their
(self-reported) negative perceptions primarily
because they had not been adequately trained and
prepared to align the curriculum to the assessment.



All state contacts suggested that additional re- 
search focus on the following questions: 

• Are teachers increasingly part of a long-term 
professional training program that improves 
instruction in math or reading? 



• Are students increasingly meeting the stan- 
dards because of focused instruction in math 
or reading? 



How are results collected and used at the 
state, district, school, and student levels? 

States are using the 1 percent rule in reporting 
alternate assessment results for adequate yearly 
progress. Because all scores must be included 
in adequate yearly progress calculations, any 
alternate assessment scores of “proficient” and 
“advanced” for students with the most significant 
cognitive disabilities that exceed the 1 percent cap 
must be counted as nonproficient (U.S. Depart- 
ment of Education, 2005, p. 7). States can request a 
slightly higher cap. When making such a request, 
states must adhere to strict eligibility criteria and 
address several issues, including (U.S. Department 
of Education, 2003b, p. 2): 

• Incidence rates of students with the most 
significant cognitive disabilities. 

• Circumstances in the state that would explain 
the higher incidence rates (such as specialized 
health programs or facilities). 

• Documentation showing that the state has 
implemented safeguards to limit the inappro- 
priate use of alternate assessments. 

States may also grant districts an exception allow- 
ing them to exceed the 1 percent cap. 
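
The arithmetic behind the 1 percent cap can be illustrated with a short sketch. The Python below is illustrative only: the function name `apply_cap`, the `cap_percent` parameter, the decision to round the cap down, and the enrollment figures are all assumptions made for this example, not rules or values taken from the regulations or from state data.

```python
import math
from fractions import Fraction


def apply_cap(total_tested, alt_proficient, cap_percent="1"):
    """Split proficient alternate assessment scores into those that count
    as proficient and those that must be counted as nonproficient.

    total_tested   -- students taking general or alternate assessments
    alt_proficient -- alternate assessment scores at proficient or advanced
    cap_percent    -- cap as a percentage string; "1" by default, higher only
                      if an exception has been granted (hypothetical parameter)
    """
    # Exact rational arithmetic avoids floating-point surprises at the cap.
    # Rounding the cap down is an assumption made for this sketch.
    cap = math.floor(total_tested * Fraction(cap_percent) / 100)
    counted = min(alt_proficient, cap)
    return counted, alt_proficient - counted


# Hypothetical state: 40,000 students tested, 544 proficient alternate scores.
counted, reclassified = apply_cap(40_000, 544)
print(counted, reclassified)  # 400 144
```

In this hypothetical case 400 scores count as proficient and the remaining 144 would be counted as nonproficient for adequate yearly progress. Passing a higher `cap_percent` shows the effect of an approved exception.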

The authors of the present study verified the ap- 
plication of the 1 percent rule by looking at state 
assessment data submitted to the U.S. Depart- 
ment of Education, as reported by Thurlow, Moen, 
and Altman (2006). For 2003/04 Arkansas, New 
Mexico, and Oklahoma were within the 1 percent 
cap for reading and math. Louisiana exceeded 
the cap by 16 students for high school, report- 
ing 1.36 percent for reading and 1.16 percent for 
math at proficient and above. Texas exceeded 
the cap significantly for reading (5.54 percent 
for elementary, 5.70 percent for middle school, 
and 5.03 percent for high school) and math (4.61
percent for elementary, 4.87 percent for middle 
school, and 5.01 percent for high school). Ac- 
cording to Texas’s reported percentages, the state 
exceeded the cap by an average of 13,500 students 
per grade. 

States must identify the content to be included on 
alternate assessments and must decide how mas- 
tery of this content is defined. To define mastery 
the five states in the Southwest Region undertook 
standard setting procedures for their alternate 
assessments (see appendix A and table C7 in ap- 
pendix C). 

Because states define their own performance levels 
(such as advanced, basic, below basic), compari- 
sons across states can be difficult. That said, this 
study’s findings replicate a national pattern of 
significant differences across states for the number 
and percentage of students scoring proficient or 
above on alternate assessments (National Center 
on Educational Outcomes, 2005). The authors of 
this study examined evidence from the 2005/06
school year because these were the data submitted
for the peer review process. Table 5 summarizes 
the percentages of students at proficient or above 
for mathematics and reading. 

In the state report cards for 2005/06 some states 
report large differences between the percentage 
of students scoring proficient or above on the 1 
percent alternate assessment and on the general 
education criterion-referenced test (table 6; see
also table C1 in appendix C).

Differences between proficient scores on general 
and alternate assessments cannot be attributed 
solely or primarily to instructional practices. 

States define their own achievement levels. And 
the peer review results — as well as ongoing revi- 
sions to assessment approaches in each state — in- 
dicate that the technical quality of states’ alternate 
assessments has not been fully demonstrated. 
States must monitor alternate assessment admin- 
istration and scoring more closely to ensure that 
procedures are followed uniformly and objectively. 
Further, states must explain any differences in 



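TABLE 5

Students scoring at proficient or above on alternate assessments in
math and reading by grade and state, 2005/06 (percent)

Grade | Arkansas math | Arkansas reading | Louisiana math | Louisiana reading | New Mexico math | New Mexico reading | Oklahoma math | Oklahoma reading | Texas math | Texas reading
3 | 67 | 57 | 50 | 64 | 30 | 67 | 88 | 90 | 98 | 96
4 | 60 | 61 | 53 | 75 | 41 | 67 | 87 | 88 | 95 | 91
5 | 50 | 56 | 60 | 75 | 36 | 66 | 83 | 88 | 94 | 91
6 | 57 | 59 | 68 | 80 | 46 | 70 | 84 | 88 | 88 | 88
7 | 50 | 53 | 64 | 76 | 53 | 77 | 71 | 74 | 83 | 84
8 | 44 | 66 | 67 | 80 | 48 | 73 | 79 | 81 | 83 | 86
9 | n/a | n/a | n/a | n/a | 49 | 78 | n/a | n/a | 74 | 80
10 | n/a | n/a | 68 | 77 | 50 | 74 | 75 | 76 | 78 | 80
11 | n/a | n/a | n/a | n/a | n/a | n/a | 66 | 66 | n/a | n/a
12 | n/a | n/a | n/a | n/a | n/a | n/a | 63 | 40 | n/a | n/a

n/a is not available or no assessment at this grade.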

Source: For math percentages: in Arkansas, Arkansas Department of Education, 2006c; in Louisiana, Louisiana Department of Education, 2007; in New
Mexico, New Mexico Public Education Department, 2007a; in Oklahoma, Oklahoma State Department of Education, 2006; and in Texas, Texas Education
Agency, 2007. For reading percentages: in Arkansas, Arkansas Department of Education, 2006c; in Louisiana, Louisiana Department of Education, 2007; in
New Mexico, New Mexico Public Education Department, 2007b; in Oklahoma, Oklahoma State Department of Education, 2006; and in Texas, Texas Education
Agency, 2007.
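TABLE 6

Students scoring proficient or above for general and alternate assessments in math
and reading in grades 4 and 5 in the Southwest Region states, 2005/06 (percent)

Assessment and grade | Arkansas | Louisiana | New Mexico | Oklahoma | Texas
Math, general assessment, grade 4 | 50 | 61 | 36 | n/a | n/a
Math, general assessment, grade 5 | n/a | n/a | n/a | 74 | 81
Math, alternate assessment, grade 4 | n/a | 53 | 41 | n/a | n/a
Math, alternate assessment, grade 5 | 50 | n/a | n/a | 83 | 94
Reading, general assessment, grade 4 | n/a | n/a | n/a | n/a | n/a
Reading, general assessment, grade 5 | 50 | 64 | 55 | 77 | 81
Reading, alternate assessment, grade 4 | n/a | n/a | n/a | n/a | n/a
Reading, alternate assessment, grade 5 | 56 | 75 | 66 | 88 | 91

n/a is not available or no assessment at this grade.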

Source: State report cards for Arkansas (http://normessasweb.uark.edu/reportcards/state05.php), Louisiana
(http://www.doe.state.la.us/lde/pair/StateReport0405/10-Student_Achievement.pdf), New Mexico
(http://www.ped.state.nm.us/div/acc.assess/accountability/dlRptCard2005/NMStateReportCard%20English.pdf), Oklahoma
(http://title3.sde.state.ok.us/studentassessment/2005results/reportcard2005state.pdf), and Texas
(http://www.tea.state.tx.us/research/pdfs/2005_comp_annual.pdf).



proficiency rates in scores on general and alternate 
assessments. 

Finally, the No Child Left Behind Act and the In- 
dividuals with Disabilities Education Act of 2004 
require that assessment systems collect data and 
provide reports at the student, class, school, dis- 
trict, and state levels. A 2005 report on the status 
of the states in special education by the National 
Center on Educational Outcomes found that states 
were complying with these requirements for accountability
purposes (Thompson et al., 2005). The
authors of the present study, however, have noted 
that states differ greatly in the data they report, in 
how they report it, and in how clearly they present 
information across the required levels (student, 
class, school, and local education agency). Differ- 
ent states use different cells and different lan- 
guage. For example, New Mexico’s student reports 
show a performance level for each standard 
measured. But the other states report chiefly at the 
content-strand level. 



To what extent do state alternate assessments 
capture the same or similar skills as state tests 
designed for the general student population? 

State systems of standards and assessment pro- 
vide useful information for valid accountability 
decisions and education improvement only to 
the degree that all components are aligned or 
linked. State assessments must be linked to state 
standards and — ultimately — to instruction. The 
federal peer review process requires each state 
to present evidence that its assessment system is 
aligned or linked to its standards and to submit 
the evidence from alignment studies and plans 
to fill any coverage gaps (National Alternate 
Assessment Center, 2005; U.S. Department of 
Education, Office of Elementary & Secondary 
Education, 2004). 

Some states generate alignment evidence in the 
test development process and document the steps 
that they have taken to ensure linkage (such as 
mapping items to standards). Some states have
prepared documents and training to link assessment
with instruction. But studies of alternate
assessments’ linkage to general education content
standards and of their alignment with alternate
content standards are needed (Lewis et al., 1996;
Roach & Elliott, 2004). Three of the Southwest 
Region states have to submit alignment stud- 
ies for the 2005/06 year, identify any gaps, and 
outline a plan to fill them (U.S. Department of 
Education, 2006a). 

Four states used, or are using, the Webb method- 
ology for alignment. The Webb alignment model 
uses content experts to measure the degree of 
correspondence between state-level standards 
and assessments (Ananda, 2003). The criteria 
for measuring linkage are (Wisconsin Center for 
Educational Research, 2007): 

• Categorical concurrence— the extent to which 
the same, or consistent, content categories ap- 
pear in standards and assessments. 

• Range of knowledge correspondence — 
whether the span of knowledge expected of 
students on the basis of a standard corre- 
sponds to the span that they need to answer 
the corresponding assessment items or activi- 
ties correctly. 

• Balance of representation — whether objectives 
that fall under a specific standard are given 
relatively equal emphasis on the assessment. 

• Depth of knowledge consistency — the extent 
to which the knowledge elicited from students 
on the assessment is as complex within the 
content area as what students are expected to 
know and do. 
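
The four criteria above can be made concrete with a small calculation for a single standard. The sketch below is illustrative and is not the procedure used by any of the states or by the Webb reviewers: the function name `webb_summary`, its inputs, and the sample data are hypothetical, and the balance index uses a form commonly attributed to Webb's model, which is an assumption here because this report does not give a formula.

```python
from collections import Counter


def webb_summary(objective_dok, item_matches):
    """Summarize alignment indicators for ONE standard.

    objective_dok -- {objective: depth-of-knowledge level of that objective}
    item_matches  -- list of (objective, item depth-of-knowledge level) pairs,
                     one per assessment item reviewers matched to the standard
    """
    hits = Counter(objective for objective, _ in item_matches)
    total = len(item_matches)
    if total == 0:
        return {"items": 0, "range": 0.0, "balance": 0.0, "dok_at_or_above": 0.0}

    # Range of knowledge: share of the standard's objectives hit by any item.
    range_share = len(hits) / len(objective_dok)

    # Balance of representation: 1.0 when items are spread evenly across the
    # objectives that were hit, lower when a few objectives dominate.
    balance = 1 - sum(abs(1 / len(hits) - n / total) for n in hits.values()) / 2

    # Depth of knowledge: share of items at least as complex as their objective.
    deep_enough = sum(dok >= objective_dok[obj] for obj, dok in item_matches)

    return {
        "items": total,  # categorical concurrence: items addressing the standard
        "range": range_share,
        "balance": balance,
        "dok_at_or_above": deep_enough / total,
    }


# Hypothetical standard with four objectives and six matched items.
objectives = {"a": 2, "b": 2, "c": 1, "d": 3}
matches = [("a", 2), ("a", 2), ("a", 1), ("b", 2), ("b", 3), ("b", 2)]
summary = webb_summary(objectives, matches)
```

In this made-up case six items address the standard, half of its objectives are covered, the items are evenly balanced across the objectives that were hit, and five of the six items are at least as complex as their objective. Deciding what values count as acceptable alignment is left to the review panel and is not encoded in the sketch.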

Louisiana used the WestEd methodology (WestEd,
2004), which uses the Webb review categories (see
table C7 in appendix C).

All Southwest Region states reported that they
are just beginning to validate the linkage of their
alternate assessments with instruction. Louisiana,
New Mexico, and Texas needed to submit
reports on alignment studies and to show how
they would fill any gap. Texas also had to submit its
standards and has now done so. Accurate data about the
consequences of the assessment system take time to
collect, so states such as Louisiana and Texas will
not have these data available until at least 2008.
New Mexico was expected to have study results
in the fall of 2007, but data were not available at
the time of writing (D. Farley, Education Consultant,
Special Education Bureau at the New Mexico
Public Education Department, personal communication,
June 27, 2007; C. Wieland, personal
communication, 2007).



Representatives from all five states recognize 
the need for empirical evidence and guidance to 
demonstrate the relationships among standards, 
assessment, and instruction. State contacts in 
Arkansas and New Mexico pointed out that this 
is an ongoing process. If a content area is added,
as Arkansas has added high school science, the
process must be repeated. If standards change, as
they have in New Mexico and Oklahoma, the
linkage must be reexamined.

Four of the five Southwest Region states (all but 
Oklahoma) participate in the Council of Chief 
State School Officers Assessing Special Education 
Students State Collaborative on Assessment and 
Student Standards, where some of these issues are 
discussed. 



All states can participate in the Inclusive Assess- 
ment and Accountability Community of Practice, 
sponsored by the National Alternate Assessment 
Center. This community can help states working 
on their technical adequacy learn about the types 
of evidence to collect, the questions to ask, and the 
resources available to support objective linkage 
studies. 
What technical issues are states facing in developing and
implementing reliable and valid alternate assessments? 



The No Child Left Behind Act says that the “require- 
ments for high technical quality . . . including valid- 
ity, reliability . . . and consistency with nationally 
recognized professional and technical standards 
apply to alternate assessments as well as to regular 
State assessments” (Rules and Regulations, 68, p. 
68609. Fed. Reg. 236, December 9, 2003). States and 
researchers are wrestling with how to demonstrate 
the validity, reliability, and usability of alternate 
assessment systems. Louisiana, New Mexico, and 
Texas are working to address this question. Ar- 
kansas and Oklahoma will become increasingly 
engaged in doing so as they add high school science 
(Arkansas) and redefine their alternate academic 
standards (Oklahoma; see table C8 in appendix C).



In the survey and interviews that inform this 
study, representatives of all five Southwest Region 
states openly discussed technical challenges (see 
appendix C). As states conduct new alignment 
studies and standard setting activities, they are 
required to demonstrate the alignment or linkage 
of alternate assessment achievement standards 
with grade-level content standards and alternate 
assessments — a challenging task (especially when 
states allow teachers to select the content stan- 
dards being measured or the tasks for assessment). 
Because of changes to content standards, achieve- 
ment standards, and assessment approaches in the 
past several years, three states lack continuous, 
year-to-year trend data that could be used to mea- 
sure progress and growth. All three are exploring 
ways to conduct consequential validity studies to 
explore policies’ positive and negative effects. 






All five Southwest Region states report combinations
of raw scores, transformed scores, and scores
based on achievement levels. A few states outside
the region (such as Kentucky, North Carolina,
Oregon, and South Carolina) have used the
assessment triangle (see appendix A) for validating
alternate assessments for students with significant
cognitive disabilities (Kleinert, Browder, & Towles- 
Reeves, 2005; see also Pellegrino et al., 2001). Three 
states — Louisiana, New Mexico, and Texas — still 
had to submit evidence on the technical adequacy, 
alignment, inclusion, and reporting of their full 
alternate assessment system to federal peer review 
(U.S. Department of Education, 2006a). Arkansas 
submitted additional evidence and was given full 
approval in December 2006 (U.S. Department of 
Education, 2006a). The information presented in 
appendix C suggests that peer review categories are 
the critical consideration in developing alternate 
assessments that meet the challenges of the No 
Child Left Behind Act and assessment standards. 

A review of technical documents indicates that all 
states need better documentation of their efforts 
to demonstrate validity, reliability, and usability. 
Much work is needed to ensure that alternate as- 
sessments reflect adequate psychometric prop- 
erties and instructional relevance for students 
with significant cognitive disabilities. And more 
research is needed to validate the alternate ap- 
proaches being used for students with significant 
cognitive disabilities. 

Because states in the Southwest Region have not 
used the same approach or standards for three 
years in a row, researchers cannot look at data 
trends to show student progress. Further, three 
states — Louisiana, New Mexico, and Texas — have 
fallen short in providing evidence for content, 
response process, and various types of validity. 
Contacts in all five Southwest Region states said 
that since each state needs to continually address 
questions of validity, reliability, and usability, 
technical assistance from researchers or advisory 
committees looking at technical adequacy and 
reviewing technical manuals would help. 



FAR-REACHING QUESTIONS, EXISTING 
APPROACHES, FURTHER RESEARCH 

The states in the Southwest Region are grappling 
with the same basic technical challenges and 
tradeoffs as states in other regions. All states confront
two far-reaching questions about alternate
assessments: 

• Do the advantages of flexibility (tailoring 
assessments to the wide range of needs of 
students with significant cognitive disabili- 
ties) outweigh the technical challenges of such 
a nonuniform approach? 

• Does the possibility of attaining greater 
consequential validity through alternate as- 
sessments balance the difficulty of obtaining 
evidence for sufficient technical adequacy (at 
least as adequacy is traditionally defined)? 

These questions are researchable. 

Some states are attempting to increase the techni- 
cal adequacy of their alternate assessments with 
a range of strategies that go beyond traditional 
statistical analyses. Such broad-based approaches 
supplement technical analysis with careful 
explanation of the assessments’ purposes and 
procedures, mandatory uniform training for all 
administrators and scorers, and active monitoring 
of live administrations (McGregor, 2007). Another 
promising approach involves combining data 
across several years to increase sample sizes and 
better measure reliability and validity using exist- 
ing methodologies. 



Several states have joined an assessment consortium
to increase available resources, expertise, and
sample sizes for assessments of English language
proficiency. Such collaboration requires upfront
planning and, possibly, compromising on an
assessment’s content and the range of purposes for
which it may be used. Otherwise, the test may not
be valid (aligned and accessible) for each state in
the consortium.

Several questions merit further research attention:

• What are the characteristics of students with
significant cognitive disabilities, and is it
possible to validate the characteristics and
criteria for their placement? Much current
research points to a need to better identify
who these students are, both to prevent states
from overidentifying them for the alternate
assessment and to enable states to
develop and tailor alternate assessments to
them.

• Can appropriate guidelines be developed to
justify IEP team decisions on alternate assessment?
More data should be collected on the
types of guidelines that the teams must follow
in determining whether a student qualifies.
Since guidelines and criteria vary from state
to state, research is needed to get a better sense
of which students are being served, the numbers
being served, and the efficacy of eligibility
requirements.

• Should assessment approaches (portfolio
and performance based) be redefined for the
Southwest Region states? For states across
the country? Although states have had to
use some form of alternate assessment for
seven years, comprehensive research into the
relationships between assessment formats and
student outcomes remains sparse.



• How expert are teachers in linking functional
skills to content standards? Some research- 
ers have asserted that once teachers are well 
versed in this area, more checks and balances 
will be added to the accountability picture 
(Browder et al., 2004). More must be learned 
about the alignment of functional skills and 
content standards and teacher training, as 
very little is known about teachers’ perceived 
skill levels in this area. 

• How is the impact of alternate assessment
policies and practices best measured? Be- 
cause of national variations, more research is 
needed here. 

• Are states validly, reliably, and accurately
measuring student performance? 

• Are teachers increasingly part of long-term 
professional development programs that 
improve instruction in reading? In math? In 
science? Some states already provide training 
in these areas. 

• Are students increasingly meeting the stan- 
dards because of focused instruction in read- 
ing? In math? In science? 

• Is the assessment triangle usable and beneficial 
for alternate assessments? What is its efficacy? 



• Do alternate assessments reflect robust 
psychometric properties, instructional 
relevance, and technical adequacy? Explor- 
ing these three areas is a longstanding 
challenge. Each area needs to be examined 
separately to meet federal guidelines and to 
improve alternate assessments and out- 
comes for students with significant cognitive 
disabilities. 

For some of these questions the answers might 
ultimately lie in the policy and values arenas
rather than in the technical realm — though new 
approaches to validation must be developed and 
studied. 

APPENDIX A
GLOSSARY 

1 percent rule. When measuring adequate yearly 
progress, states and school districts have the flex- 
ibility to count the “proficient” and “advanced” 
scores of students with the most significant cogni- 
tive disabilities who take alternate assessments 
based on alternate achievement standards — as 
long as the number of proficient and advanced 
scores so counted does not exceed 1 percent of all 
students in the grades tested (about 9 percent of 
students with disabilities). The 1 percent cap is 
based on current incidence rates of students with 
the most significant cognitive disabilities, allowing 
for reasonable local variation in prevalence. (U.S. 
Department of Education, 2003a) 

2 percent rule. Under the final regulations on 
modified academic achievement standards, when 
measuring adequate yearly progress, states and 
local education agencies have the flexibility to 
count the “proficient” and “advanced” scores of 
certain students — a small group who are identified 
as disabled and take alternate assessments based 
on modified academic achievement standards, but 
are not identified as having the most significant 
cognitive disabilities — so long as the number of 
proficient and advanced scores so counted does 
not exceed 2 percent of all students in the grades 
assessed (or about 20 percent of students with dis- 
abilities). The 2 percent cap, in conjunction with 
the requirements for state guidelines, is meant
to discourage the inappropriate assessment of
students based on modified academic achievement 
standards. (U.S. Department of Education, 2007) 
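
The arithmetic behind the two caps can be sketched as follows. The enrollment and score counts are hypothetical, and rounding the limit down to a whole number of scores is an assumption made for illustration, not a rule stated in the regulations cited above.

```python
def cap_limit(students_in_tested_grades, cap_percent):
    """Largest whole number of scores that stays within the percentage cap.

    Rounding down is an assumption of this sketch. Integer arithmetic avoids
    floating-point surprises at the boundary.
    """
    return students_in_tested_grades * cap_percent // 100


def scores_over_cap(proficient_scores, students_in_tested_grades, cap_percent):
    """Number of proficient and advanced scores that exceed the cap (0 if none)."""
    limit = cap_limit(students_in_tested_grades, cap_percent)
    return max(0, proficient_scores - limit)


# Hypothetical state with 250,000 students in the grades tested.
tested = 250_000
one_percent_limit = cap_limit(tested, 1)  # alternate achievement standards
two_percent_limit = cap_limit(tested, 2)  # modified achievement standards

# Hypothetical count of proficient and advanced alternate assessment scores.
excess = scores_over_cap(2_650, tested, 1)
print(one_percent_limit, two_percent_limit, excess)
```

Both caps limit how many proficient and advanced scores may be counted toward adequate yearly progress; neither limits how many students may take the assessment.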

Accommodations. A change in the administration 
of an assessment (such as setting, scheduling, tim- 
ing, presentation format, response mode, or oth- 
ers, including any combination of these) that does 
not change the construct intended to be measured 
by the assessment or the meaning of the result- 
ing scores. Accommodations are used for equity, 
not advantage, and serve to level the playing field. 
To be appropriate, assessment accommodations 
must be identified in the student’s individualized
education program or an accommodation plan
under Section 504 of the Rehabilitation Act of 
1973 and used regularly during instruction and 
classroom assessment. (Policy to Practice Study 
Group, 2003) 

Accountability. The use of assessment results and 
other data to ensure that schools are moving in de- 
sired directions. Common elements include stan- 
dards, indicators of progress toward meeting those 
standards, data analysis, reporting procedures, 
and rewards or sanctions. (Policy to Practice Study 
Group, 2003) 

Adaptations. A general term that describes a 
change in the presentation, setting, response, tim- 
ing, or scheduling of an assessment that may or 
may not change the construct of the assessment. 
(Policy to Practice Study Group, 2003) 

Adequate yearly progress. A provision of the 
federal No Child Left Behind legislation requiring 
schools, districts, and states to demonstrate, on 
the basis of test scores, that students are mak- 
ing academic progress. Each state was required 
to submit by January 31, 2003, a specific plan for 
monitoring adequate yearly progress. (Policy to 
Practice Study Group, 2003) 

Alignment. Alignment refers to the degree to 
which the content (such as skills and concepts) 
in two sets of standards or in an assessment and 
set of standards concurs. Alignment relationships 
tend to be direct relationships (skill and content 
matches) and are typically observed between stan- 
dards and assessments for a single student popula- 
tion (such as general education, special education, 
or English language learners). (WestEd, 2004) 

Alternate assessment. An instrument used to 
gather information on the standards-based per- 
formance and progress of certain students, such 
as those whose disabilities preclude their valid 
and reliable participation in general assessments. 
Alternate assessments measure the performance 
of a relatively small population of students who 
are unable to participate in the general assessment 
system, with or without accommodations, as determined
by the individualized education program
team. (Policy to Practice Study Group, 2003) 

Alternate achievement standard. An alternate 
achievement standard is an expectation of perfor- 
mance that differs in complexity from a general 
achievement standard. Alternate achievement 
standards must be aligned with a state’s academic 
content standards, promote access to the general 
curriculum, and reflect professional judgment of 
the highest achievement standards possible (see 
No Child Left Behind Act of 2001, §200.1(d)). These
standards will be considered during the peer 
review of each state’s standards and assessment 
system under the No Child Left Behind Act. (U.S. 
Department of Education, 2003a) 

Assessment triangle. An assessment framework, 
based on the premise that three foundational 
elements — cognition, observation, and interpre- 
tation — influence all functions of an assessment’s 
design and use and must work in synchrony to be 
effective (Pellegrino et al., 2001). 

The framework triangulates the balance and evi- 
dence needed among three areas: student cogni- 
tion (what do we know about how students learn?), 
observation (measurement, or how do we create 
situations that allow students to demonstrate what 
they have learned?), and interpretation (how do we 
draw inferences from the performance?). For in- 
stance, instead of raw scores, transformed scores, 
such as scaled scores (see “Scores” below), should 
be used for interpretation and decisionmaking 
(American Educational Research Association et 
al., 1999). The performance of students receiving
transformed scores can be compared with that of 
other students, using percentiles for example. 

The assessment triangle does not allow measure- 
ment of differentiated learning, a key concept in 
large-scale testing. 

Bias (test bias). In a statistical context bias is a systematic
error in a test score. In discussing test fairness,
bias is created by not allowing certain groups
into the sample, not designing the test to allow all
groups to participate equitably, selecting discrimi- 
natory material, testing content that has not been 
taught, and so on. Bias usually favors one group of 
test takers over another, resulting in discrimina- 
tion. (Policy to Practice Study Group, 2003) 

Body of evidence. Information or data establishing
that a student can perform a particular skill or has 
mastered a specific content standard. The informa- 
tion or data were either produced by the student 
or collected by someone knowledgeable of the 
student. (Policy to Practice Study Group, 2003) 

Bookmark. An approach to standard setting where 
reviewers establish cut scores for specified levels of 
proficiency. (Olsen et al., 2002) 

Checklist/rating scale. In this alternate assessment 
approach teachers evaluate whether students can 
perform certain behaviors or have mastered cer- 
tain skills. Scoring is based on the number of skills 
the student is able to perform successfully. 

Cutscore. A specified point on a score scale. Scores 
at or above that point are interpreted differently 
from scores below that point. (Policy to Practice 
Study Group, 2003) 

Functional academics. Cognitive abilities and 
skills learned at school. The school subjects that 
directly apply to and teach the skills needed in 
one’s everyday environment. The idea behind 
functional academics is to implement and teach 
academic skills in a way that students can general- 
ize from one setting to the next, outside the school 
context. (Turnbull, Turnbull, Shank, Smith, &
Leal, 2002)

Functional skills. Functional skills are daily living 
skills that provide the essential knowledge, skills, 
and understanding to operate successfully, effec- 
tively, and independently in life and at work. The 
premise behind functional skills is that they allow 
an individual with a disability more access to 
opportunities for participation in the community. 
(Turnbull et al., 2002) 

Individualized education program. A document
that reflects the decisions made by an interdisci- 
plinary team, including the parent and the student 
when appropriate. During an individualized 
education program meeting for a student with a 
disability, the individualized education program 
team (IEP team) will identify the student’s abilities 
and disabilities. (Policy to Practice Study Group, 
2003) 

Large-scale assessment. A test administered simul- 
taneously to large groups of students within the 
district or state. (Policy to Practice Study Group, 
2003) 

Linkage. Relationships that tend to be develop- 
mental, foundational, or proximal and are typi- 
cally observed between standards and assessments 
developed for different populations (such as gen- 
eral education standards and alternate standards). 
(WestEd, 2004) 

Peer review. The use of state experts to review 
a state’s standards and assessment system to 
determine whether it meets No Child Left Behind 
requirements. 

Performance descriptors. Statements that describe 
what students at each performance level should 
know and be able to do. 

Performance levels. Performance levels provide 
a determination of the extent to which a student 
has met the content standards. They distinguish a 
proficient or adequate performance from a novice 
or expert performance. (Policy to Practice Study 
Group, 2003) 

Portfolio. A collection of student-generated or 
student-focused evidence that provides the basis 
for demonstrating the student’s mastery of a range 
of skills, performance level, or improvement in 
these skills over time. The portfolio evidence may 
include student work samples, photographs, videotapes,
interviews, anecdotal records,
and observations. (Policy to Practice Study Group, 
2003) 



Prompt. A picture or text (for example, a word) to 
stimulate a response to an item on an assessment. 
(Salvia & Ysseldyke, 2001) 

Reliability. The consistency of the test instrument; 
the extent to which it is possible to generalize from a
specific behavior observed at a specific time by a
specific person to observations of similar behavior
at different times or by different observers. (Policy
to Practice Study Group, 2003) 

Rubric. A scoring tool based on criteria used to 
evaluate a student’s test performance. The cri- 
teria contain a description of the requirements 
for varying degrees of success in responding to 
the question or performing the task. Rubrics can 
be diagnostic or analytic (providing ratings of 
multiple criteria), or they can be holistic (describ- 
ing a single global trait). (Policy to Practice Study 
Group, 2003) 

Scores (raw and scale). Raw scores are tradition- 
ally converted to scale scores (such as 200-600, 
with 400 as the mean) for various purposes.
Using scale scores can support a testing program’s 
validity, simplify reporting, and compensate for 
variation in task or item difficulty and each item’s 
weighted importance during scoring. By using 
raw scores, testing programs limit their ability to 
compare student performance or show individual 
growth from one testing window to the next. Scale 
scores are often linked to cutscores (set during 
standard setting) and performance descriptors. 
Qualitative descriptions can be attached read- 
ily to ranges of scores to enhance interpretation. 
(American Educational Research Association et 
al., 1999)
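
As a rough illustration of how raw scores, scale scores, and cutscores relate, the sketch below applies a simple linear transformation to a 200-600 scale with a mean of 400 and then assigns performance levels. The standard deviation of 50, the cutscores, and the raw scores are all hypothetical, and operational programs typically use more elaborate scaling methods than this linear rescaling.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Hypothetical cutscores and level names (not taken from any state program).
CUTSCORES = [350, 400, 450]
LEVELS = ["below basic", "basic", "proficient", "advanced"]


def to_scale_scores(raw_scores, scale_mean=400, scale_sd=50, lo=200, hi=600):
    """Linearly rescale raw scores to the reporting scale, clipped to [lo, hi]."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    if sd == 0:
        # All raw scores identical: everyone sits at the scale mean.
        return [scale_mean for _ in raw_scores]
    scaled = (scale_mean + scale_sd * (r - m) / sd for r in raw_scores)
    return [min(hi, max(lo, round(s))) for s in scaled]


def performance_level(scale_score):
    """Scores at or above a cutscore fall into the next higher level."""
    return LEVELS[bisect_right(CUTSCORES, scale_score)]


raw = [10, 20, 30]  # invented raw scores
scaled = to_scale_scores(raw)
levels = [performance_level(s) for s in scaled]
print(list(zip(raw, scaled, levels)))
```

Using `bisect_right` places a score exactly equal to a cutscore in the higher level, matching the glossary's "at or above" wording for cutscores.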

Standard setting. Determining appropriate 
cutscores that correspond to specified levels of 
performance (such as below basic, basic, profi- 
cient, and advanced). In addition, during standard 
setting, descriptors are written indicating what 
students at each performance level should know 
and be able to do. Standard setting has an im- 
portant relationship to instruction because this 
information, which accompanies assessment
results, helps inform instructional planning for
each student. (Roach & Elliott, 2004) 

Standardized. An established procedure that 
ensures that a test is administered with the same 
directions, under the same conditions (time limits 
and so on), and is scored in the same manner for 
all students to ensure reliable and valid compari- 
son among students taking the test. (Policy to 
Practice Study Group, 2003) 

Technical adequacy. The extent to which an assess- 
ment meets the requirements for validity, reliabil- 
ity, accessibility, objectivity, and consistency with 
nationally recognized professional and technical 
standards. Evidence for technical adequacy can 
include information on administration, scoring, 
interpretation, and technical data. (U.S. Depart- 
ment of Education, 2007) 

Technical assistance. Technical assistance services 
are timely, specialized guidance and customized 
supports that help states, districts, schools, and 
educators solve specific problems, and increase 
their capacity to improve student learning. Technical
assistance can be short- or long-term. (Delaware
Department of Education, 2008)

Validity. The extent to which a test measures what 
it was designed to measure (Policy to Practice 
Study Group, 2003). Common types of validity 
include: 

• Construct validity. The extent to which the
characteristic to be measured relates to test
scores measuring the behavior in situations
in which the construct is thought to be an 
important variable. 

• Content validity. The extent to which the stimulus
materials or situations composing the
test call for a range of responses that represent 
the entire domain of skills, understandings, or 
behaviors that the test is intended to measure. 

• Convergent validity. The extent to which the 
assessment results positively correlate with 
the results of other measures designed to as- 
sess the same or similar constructs. 

• Criterion-related validity. The extent to which 
test scores of a group or subgroup are com- 
pared with other criterion measures (ratings, 
classifications, other tests) assigned to the 
examinees. 

• Face validity. A concept based on a judgment 
about how relevant the test items appear to be; 
it relates more to what a test appears to mea- 
sure than to what the test actually measures. 
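
Convergent validity, as defined above, is usually examined with a correlation coefficient. The sketch below computes a Pearson correlation between alternate assessment scores and a second measure of the same construct; both score lists are invented, and the choice of Pearson's r (rather than a rank-based coefficient) is an assumption made for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one measure has no variance")
    return sxy / sqrt(sxx * syy)


# Invented scores for five students on two measures of the same construct.
alternate_scores = [2, 4, 5, 7, 9]
teacher_ratings = [1, 3, 4, 4, 6]

r = pearson_r(alternate_scores, teacher_ratings)
print(f"r = {r:.3f}")
```

With the very small samples typical of alternate assessments, a coefficient like this is unstable, which is one reason the report discusses pooling data across years.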

Work sample. Work samples, as found in portfo- 
lios, are examples of student work collected over 
time. To allow scoring after a portfolio is com- 
pleted the results of work samples must be stored 
as artifacts. Examples of artifacts in portfolios are 
photographs or videotapes of the student perform- 
ing a task (such as placing pictures in sequential 
order), audiotapes (such as a student reading), 
writing samples, drawings, and tests. 

APPENDIX B

SOUTHWEST REGION STATE 
DEMOGRAPHIC INFORMATION 



TABLE B1 

State demographic statistics 



State education agencies: Arkansas Department of Education; Louisiana
Department of Education; New Mexico Public Education Department; Oklahoma
State Department of Education; Texas Education Agency.

Statistic                             Arkansas   Louisiana  New Mexico    Oklahoma       Texas

K-12 student enrollment                463,115     724,281     326,102     629,476   4,405,215
Number of school districts                 318          68          89         541       1,039
Approximate number of students
  taking alternate assessment            3,700       4,500       2,200       3,100      40,000
Number of public schools                 1,158       1,541         842       1,787       8,746
Elementary schools                         600         831         449       1,020       4,224
Middle schools                             201         245         152         294       1,576
High schools                               311         303         119         483       1,687
Multilevel schools                           —         156           —           —         469
Alternative schools                          5          36          27           —         714
Career/tech schools                          —           —           —          54           —
Charter schools                             17          17          44          12         321

Student race/ethnicity (percent)
  Asian                                      1           1           1           2           3
  Black                                     23          48           3          11          14
  Hispanic                                   6           2          53           8          45
  American Indian                            1           1          11          19          <1
  White                                     69          48          32          61          38

Students receiving free or
  reduced-price lunch (percent)             52          62          58          54          48
Students receiving special
  education services (percent)              12          14          20          15          12
Students receiving English language
  learner services (percent)                 4           2          19           7          16
Graduation rate (percent)                   81          83          84          86          84


— is not available. 

Source: State report cards for Arkansas (http://normessasweb.uark.edu/reportcards/state05.php), Louisiana (http://www.doe.state.la.us/lde/pair/ 
StateReport0405/10-Student_Achievement.pdf), New Mexico (http://www.ped.state.nm.us/div/acc.assess/accountability/dlRptCard2005/NMStateReportCard
%20English.pdf), Oklahoma (http://title3.sde.state.ok.us/studentassessment/2005results/reportcard2005state.pdf), and Texas (http://www.tea.state.tx.us/ 
research/pdfs/2005_comp_annual.pdf); state department of education contacts (for Arkansas, Charlotte Marvel, Math and Assessment Specialist, Arkansas 
Department of Education; for Louisiana, Jeanne Johnson, Education Consultant, Louisiana Department of Education; for New Mexico, Dan Farley, Education 
Consultant, Special Education Bureau at the New Mexico Public Education Department; for Oklahoma, Amy Daugherty, Coordinator, Compliance Activities 
and Assessment, Special Education Department Oklahoma Department of Education; for Texas, Cari Wieland, Director, Special Education Assessments, Texas 
Education Agency).
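
A quick calculation from the figures in table B1 shows how large the alternate assessment population is relative to total enrollment in each state. This is a sketch only: the counts are the approximate values reported in the table, and the resulting share is not the 1 percent cap calculation, because enrollment here includes untested grades and the cap applies to proficient and advanced scores rather than to test takers.

```python
# (K-12 enrollment, approximate number of students taking alternate assessment),
# copied from table B1.
table_b1 = {
    "Arkansas": (463_115, 3_700),
    "Louisiana": (724_281, 4_500),
    "New Mexico": (326_102, 2_200),
    "Oklahoma": (629_476, 3_100),
    "Texas": (4_405_215, 40_000),
}

# Alternate assessment takers as a percentage of K-12 enrollment.
share_percent = {
    state: round(100 * takers / enrolled, 2)
    for state, (enrolled, takers) in table_b1.items()
}

for state, share in share_percent.items():
    print(f"{state}: {share:.2f} percent of K-12 enrollment")
```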

APPENDIX C

SIDE-BY-SIDE COMPARISON OF 
SOUTHWEST REGION STATES' ALTERNATE 
ASSESSMENT POLICIES AND PRACTICES 



TABLE C1

State assessments by grade and subject 





Arkansas

Comprehensive No Child Left Behind Title I assessment system. State tests:

• Arkansas Comprehensive Testing and Accountability Program (ACTAAP).
• Benchmark exams (criterion-referenced tests, grades 3-8).
• End-of-course exams (algebra, geometry, literacy).
• Iowa Tests of Basic Skills (ITBS; grades 3-8) and Iowa Test of Education
Development (ITED) at grade 9.
• Arkansas Alternate Portfolio Assessment System (AAPAS) for students with
significant cognitive disabilities.

Alternate assessment system for 2007/08. Arkansas Alternate Portfolio
Assessment System (AAPAS).

Grades and subjects tested. Grades 3-8, 9 (for math), and 11. English
language arts, math, science.

Louisiana

Comprehensive No Child Left Behind Title I assessment system. Comprehensive
assessment system:

• iLeap (norm-referenced test augmented criterion-referenced test in
grades 3, 5-7, 9).
• Louisiana Educational Assessment Program (LEAP; criterion-referenced
test in grades 4 and 8).
• Graduation exit exam (GEE).
• Louisiana alternate assessments (LAA 1 and LAA 2).

Alternate assessment system for 2007/08. LEAP Alternate Assessment (LAA 1).

Grades and subjects tested. Grades 3-8 and 10. English language arts, math,
science, social studies.

New Mexico

Comprehensive No Child Left Behind Title I assessment system. New Mexico
Achievement Assessment Program (NMAAP):

• New Mexico Standards Based Assessment (NMSBA, grades 3-9 and 11).
• New Mexico High School Competency Exam (NMHSCE).
• New Mexico English Language Proficiency Assessment (NMELPA).
• New Mexico Alternate Performance Assessment (NMAPA).

Alternate assessment system for 2007/08. New Mexico Alternate Performance
Assessment (NMAPA).

Grades and subjects tested. Grades 3-10. English language arts, math,
science, writing.

Oklahoma

Comprehensive No Child Left Behind Title I assessment system. Oklahoma School
Testing Program (OSTP) based on core curriculum (Priority Academic Student
Skills, PASS):

• Oklahoma Core Curriculum Tests (OCCT, grades 3-8).
• End-of-course tests (algebra, geometry, literacy).

Alternate assessment system for 2007/08. Oklahoma Alternate Assessment
Program (OAAP).

Grades and subjects tested. Grades 3-8 and 10-12. Reading, writing, math,
science, social studies.

Texas

Comprehensive No Child Left Behind Title I assessment system. Texas
Assessment Program (TAP):

• Texas Assessment of Knowledge and Skills (TAKS and TAKS-ALT; grades
3-12 in English and language arts, math, science, and social studies).
• State-developed Alternative Assessment (SDAA II), the Texas English
Language Proficiency Assessment System (TELPAS).
• Texas Assessment of Academic Skills (TAAS, being phased out as exit
exam).

Alternate assessment system for 2007/08. Texas Assessment of Knowledge and
Skills-Alternate (TAKS-ALT).

Grades and subjects tested. Grades 3-9 reading; 3-10 math; 4 and 7 writing;
10 English language arts.



Source: Arkansas Department of Education web sites (http://www.arkansased.org/students/assessment.html; arkedu.state.ar.us/actaap/index.htm; http:// 
www.arkedu.state.ar.us/actaap/student_assessment/student_assessment_p1.htm) and survey and interviews with department contacts (C. Marvel and T.
Hicks); Louisiana Department of Education web sites (http://www.doe.state.la.us/lde/saa/2273.html; http://www.doe.state.la.us/lde/accountability/home. 
html; www.doe.state.la.us/lde/saa/2343.html) and survey and interviews with department contact (J. Johnson); New Mexico Public Education Department 
web sites (http://legis.state.nm.us; http://www.ped.state.nm.us/div/acc.assess/assess/info.update.corner.html; http://www.ped.state.nm.us/div/acc.assess/ 
assess/index.html) and survey and interviews with department contact (D. Farley); Oklahoma State Department of Education web sites (http://www.sde. 
state.ok.us/home/defaultie.html; http://www.lsb.state.ok.us; title3.sde.state.ok.us/studentassessment) and survey and interviews with department contact 
(A. Daugherty); Texas Education Agency web sites (http://www.tea.state.tx.us/student.assessment; http://www.tea.state.tx.us/curriculum.html; http://www. 
legis.state.tx.us; http://www.sos.state.tx.us/tac/index.shtml; http://www.tea.state.tx.us/student.assessment/resources/taksalt/index.html) and survey and 
interviews with agency contact (C. Wieland).

TABLE C2

Southwest Region state laws, regulations, rules, and administrative code for alternate assessment 



Arkansas

Arkansas Department of Education Rules and Regulations Act
999 of 1999 established the Arkansas Comprehensive Testing
and Accountability Program.

Arkansas Code 6-41-217 governs individualized education
programs, alternate assessment options b (3) (b) (3) (A) iii.

Board rules and regulations: Arkansas ADC 005 19 006
5.00-5.02, 5.02.5, 5.04-5.08.

Arkansas Department of Education has adapted 34 CFR
300.138, and

• Has developed guidelines for the participation of children
with disabilities in alternate assessments for children who
cannot participate in state and districtwide assessment
programs.

• Has developed an alternate assessment system consisting
of a portfolio assessment methodology, in accordance with
the above guidelines, which was field tested during the
spring semester of the 1999/2000 school year.

Louisiana

Louisiana Administrative Code, Bulletin 118 establishes a
comprehensive assessment system:

LEAP Alternate Assessment, Level 1 (LAA 1) has been specially
designed to evaluate the progress of students with significant
disabilities.

R.S.17:24(F)(4) mandates the assessment of all students in
Louisiana public schools.

New Mexico

Per Title 6, chapter 31, part 2,
§§6.31.2.11(E)(3)(a)-(c) of the New
Mexico Administrative Code, students
with disabilities for whom alternate
assessments are appropriate under the
department's established participation
criteria may participate in alternate
assessments; the individualized
education program team must agree
and document that the student is
eligible for participation in an alternate
assessment based on alternate
achievement standards according to 34
CFR §300.320(a)(6).

Oklahoma

Oklahoma Stat. tit. 70 Sec. 1210-10-508,
Okla. Admin. Code §210:10-13-2

Board policy (210:10-13-2) requires
that districts include all students in
state assessments, with appropriate
accommodations when necessary.
Alternate assessments are offered to
students with the most significant
cognitive disabilities.

Texas

Section 39.023 of the Texas Education
Code was amended by the 75th Texas
Legislature to address the assessment
of students receiving special education
services:

Texas Education Code, subtitle H,
chapter 39, subchapter B.



Source: Arkansas Department of Education web sites (http://www.arkansased.org/students/assessment.html; arkedu.state.ar.us/actaap/index.htm; http:// 
www.arkedu.state.ar.us/actaap/student_assessment/student_assessment_p1.htm) and survey and interviews with department contacts (C. Marvel and T.
Hicks); Louisiana Department of Education web sites (http://www.doe.state.la.us/lde/saa/2273.html; http://www.doe.state.la.us/lde/accountability/home. 
html; www.doe.state.la.us/lde/saa/2343.html) and survey and interviews with department contact (J. Johnson); New Mexico Public Education Department 
web sites (http://legis.state.nm.us; http://www.ped.state.nm.us/div/acc.assess/assess/info.update.corner.html; http://www.ped.state.nm.us/div/acc.assess/ 
assess/index.html) and survey and interviews with department contact (D. Farley); Oklahoma State Department of Education web sites (http://www.sde. 
state.ok.us/home/defaultie.html; http://www.lsb.state.ok.us; title3.sde.state.ok.us/studentassessment) and survey and interviews with department contact 
(A. Daugherty); Texas Education Agency web sites (http://www.tea.state.tx.us/student.assessment; http://www.tea.state.tx.us/curriculum.html; http://www. 
legis.state.tx.us; http://www.sos.state.tx.us/tac/index.shtml; http://www.tea.state.tx.us/student.assessment/resources/taksalt/index.html) and survey and 
interviews with agency contact (C. Wieland). 

TABLE C3

Southwest Region state definition of significant cognitive disability 



Arkansas Louisiana 



The student's demonstrated cognitive functioning and 
adaptive behavior in the home, school, and community 
environments are significantly below age expectations, 
even with program modifications, adaptations, and 
accommodations. 

The student's course of study is primarily functional and life- 
skills oriented. 

The student requires extensive direct instruction or 
extensive supports in multiple settings to acquire, maintain, 
and generalize academic and functional skills necessary 
for application in school, work, home, and community 
environments. 

The student demonstrates severe and complex disabilities 
and poor adaptive skill levels (determined to be significantly 
below age expectations by that student's comprehensive 
assessment) that essentially prevent the student from 
meaningful participation in the standard academic core 
curriculum or achievement of the academic content standards 
established at grade level. 

The student's disability causes dependence on others for 
many, if not all, daily living needs, and the student is expected 
to require extensive ongoing support in adulthood. 

The student's inability to complete the standard academic 
curriculum at grade level is not primarily the result of the 
following: 

• Excessive or extended absences, poor attendance, or lack 
of instruction. 

• Sensory (visual or auditory) or physical disabilities, 
emotional-behavioral disabilities, or a specific learning 
disability. 

• Social, cultural, linguistic, or economic differences. 

• Below average reading level. 

• Low achievement in general. 

• Expectations of poor performance. 

• Disruptive behavior. 

• The student's IQ. 

• The anticipated impact of the student's performance on 
school or district performance scores. 

• The student's disability category, educational placement, 
type of instruction, or amount of time receiving special 
education services. 



To be eligible for participation in LEAP Alternate Assessment, 
the student shall: 

1. have a current multidisciplinary evaluation of the following 
exceptionalities: 

• Moderate mental disability. 

• Severe mental disability. 

• Profound mental disability. 

Or have an assessed level of intellectual functioning and 
adaptive behavior three or more standard deviations below 
the mean and the following exceptionalities: 

• Multiple disabilities. 

• Traumatic brain injury. 

• Autism. 
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New Mexico Oklahoma Texas 



The working definition of "significant 
cognitive disability" is supplied by 
the criteria for participation on a New 
Mexico alternate assessment: 

• Does the student's past and present 
performance in multiple settings 
(home, school, community) indicate 
that a significant cognitive disability 
is present? 

• Does the student need intensive, 
pervasive, or extensive levels of 
support in school, home, and 
community settings? 

• Do the student's current cognitive
and adaptive skills and performance
levels require direct instruction
to accomplish the acquisition,
maintenance, and generalization
of skills in multiple settings (home,
school, community)?



A student is defined as having a 
significant cognitive disability through 
the completion of the Alternate 
Assessment Participation Checklist. 



The determination of significant 
cognitive disability is made by the 
Admission, Review, and Dismissal (ARD) 
Committee based on state education 
agency guidelines (see table C4). 



Source: Thompson et al., 2005.

TABLE C4

Southwest Region state education agency guidelines for participation in alternate assessment 



Arkansas Louisiana 



Participation in the alternate assessments is determined by: 

• The student's individual education program (IEP) team, as
documented in the student's IEP.

Or

• The IEP team determines whether participation in the
standard state assessment program is appropriate for
students with IEPs. Students with disabilities for whom
it is deemed inappropriate to take the standard state
assessments (benchmarks and end-of-course) with the
established accommodations participate in the Arkansas
Alternate Portfolio Assessment System (AAPAS) following
the guidelines established by the board.



Students for whom the general statewide assessment is
not appropriate. The Louisiana Educational Assessment
Program (LEAP) Alternate Assessment, Level 1, is designed
for students whose IEPs reflect significant modifications of
the general education curriculum and emphasize
functional academic and life skills. A student participating in
LEAP Alternate Assessment, Level 1, is progressing toward a
certificate of achievement rather than a high school diploma.

New Mexico Oklahoma Texas



The participation criteria for the New 
Mexico Alternate Assessments have 
become rule in the state of New Mexico. 
Sections 6.31.2.11(E)(3)(a)-(c) of the New
Mexico Administrative Code now require 
that IEP teams agree and document that 
the student is eligible for participation in 
an alternate assessment according to the 
following criteria: 

• The student's past and present levels 
of performance in multiple settings 
(home, school, community) indicate 
that a significant cognitive disability 
is present. 

• The student needs intensive, 
pervasive, or extensive levels of 
support in school, home, and 
community settings. 

• The student's current cognitive and
adaptive skills and performance
levels require direct instruction
to accomplish the acquisition,
maintenance, and generalization
of skills in multiple settings (home,
school, community).



Does the student's disability result in 
substantial academic difficulties? 

Is the student's difficulty with general 
curriculum demands primarily due 
to the student's disability and not 
due to excessive absences unrelated 
to the disability, or social, cultural, 
environmental, or economic factors? 

Does the student's IEP reflect
curriculum and daily instruction that
focus on modified or
alternate standards?

Does the student have a significant 
cognitive disability? 

Does the student's demonstrated 
cognitive ability and adaptive 
behavior require substantial 
adjustments (alternate achievement 
standards) to the general education 
curriculum? 

Do the student's learning objectives 
and expected outcomes focus on 
functional application of skills as 
illustrated in the student's IEP goals, 
benchmarks, and objectives? 

Does the student require direct and 
extensive instruction to acquire, 
maintain, generalize, and transfer 
new knowledge and skills? 



The Admission, Review, and Dismissal
(ARD) Committee may decide that
the student can take the Texas Assessment
of Knowledge and Skills-Alternate
(TAKS-ALT) if the student:

• Requires support to access the 
general curriculum. Support 
may include assistance involving 
communication, response style, 
physical access, or daily living skills. 

• Requires direct, intensive, 
individualized instruction in a 
variety of settings to accomplish 
the acquisition, maintenance and 
generalization of skills. 

• Participates in the grade-level Texas 
Essential Knowledge and Skills (TEKS) 
through activities that focus on 
prerequisite skills. 

• Demonstrates knowledge and skills 
routinely in class by methods other 
than paper and pencil tasks. 

• Demonstrates performance 
objectives that may include real- 
life applications of the grade-level 
TEKS as appropriate to the student's 
abilities and needs. 



Source: Arkansas Department of Education web sites (http://www.arkansased.org/students/assessment.html; arkedu.state.ar.us/actaap/index.htm;
http://www.arkedu.state.ar.us/actaap/student_assessment/student_assessment_p1.htm) and survey and interviews with department contacts (C. Marvel and T. Hicks);
Louisiana Department of Education web sites (http://www.doe.state.la.us/lde/saa/2273.html; http://www.doe.state.la.us/lde/accountability/home.html;
www.doe.state.la.us/lde/saa/2343.html) and survey and interviews with department contact (J. Johnson);
New Mexico Public Education Department web sites (http://legis.state.nm.us; http://www.ped.state.nm.us/div/acc.assess/assess/info.update.corner.html;
http://www.ped.state.nm.us/div/acc.assess/assess/index.html) and survey and interviews with department contact (D. Farley);
Oklahoma State Department of Education web sites (http://www.sde.state.ok.us/home/defaultie.html; http://www.lsb.state.ok.us; title3.sde.state.ok.us/studentassessment)
and survey and interviews with department contact (A. Daugherty);
Texas Education Agency web sites (http://www.tea.state.tx.us/student.assessment; http://www.tea.state.tx.us/curriculum.html; http://www.legis.state.tx.us;
http://www.sos.state.tx.us/tac/index.shtml; http://www.tea.state.tx.us/student.assessment/resources/taksalt/index.html) and survey and interviews with agency contact (C. Wieland).

TABLE C5

Assessment approaches and tasks 





Arkansas Louisiana New Mexico



Approach to alternate assessment

Arkansas: Portfolio.

Louisiana: Performance tasks.

New Mexico: Performance tasks.



Description 
of tasks 



The state selects the standards and 
student learning expectations to 
be measured, and teachers choose 
specific tasks. 

The portfolio can consist of three 
types of evidence related to the 
student learning expectations. 

Each piece of evidence should 
show students' performance on 
specific tasks that indicate progress 
in the general curriculum. Evidence
can include work samples or
permanent products, captioned
photographs, and videotape with
a brief script that provides an
objective and clear measure of
what the student can perform.



The state specifies target 
indicators, and teachers select 
activities. 

Louisiana Alternate Assessment 
(LAA 1) enables test administrators 
to assess students while they are 
engaged in their daily activities. 

For purposes of LAA 1, "activities" 
are defined as organized 
educational procedures designed 
to stimulate performance of 
the skills that will be assessed. 
Examples of activities include 
lunchtime, field experiences (such 
as a trip to a store or museum), or a 
math lesson. 

Target indicators from different 
content areas can be assessed 
during one activity. 



The state defines performance
events or tasks to be used.

The New Mexico Alternate 
Assessments are similarly 
constructed in terms of test design. 
They are on-demand assessments, 
meaning that the student has to 
perform the required elements 
at one particular point in time 
or during an event specifically 
developed for the purpose of 
administering the test. It is a direct 
observation assessment, which 
means that the test administrator is 
involved solely in observation and 
scoring the student's behaviors 
against the performance indicators 
as the student completes the 
activities that compose the 
assessment. The IEP team designs
the activities, so they are
structured events. The checklist is
the portion of the assessment on 
which the test administrator scores 
each indicator, with a range of 0-4, 
based upon the demonstrated 
ability. Once a behavior is observed 
and scored, it cannot be revisited. 

If the administrator is unable to 
observe a particular indicator 
being demonstrated, activities 
or portions of activities can be 
repeated to directly target that 
particular indicator.

Oklahoma Texas



Approach to alternate assessment

Oklahoma: Portfolio.

Texas: Hybrid portfolio (portfolio and checklist).



The state mandates two areas per grade per content area, and 
teachers select three areas to be measured. 

The Oklahoma Alternate Assessment Program (OAAP) is a 
performance-driven assessment. It is based on the Extended 
Academic Standards, which consist of specific domains, 
outcomes, standards, and benchmarks extended from Priority 
Academic Student Skills (PASS). Grades 3-8 and 10 outline the
required subject areas to be assessed for the 2005/06 school 
year. The subject areas differ from grade to grade in order to 
meet No Child Left Behind requirements and Oklahoma state 
law governing the Oklahoma School Testing Program. Each 
subject area must have five pieces of evidence with support 
described. 

The teacher should answer and document the following 
questions about student tasks: 

• What is the student's performance or functional level? 

• What subjects or tasks can the student do with little or no 
difficulty? 

• Is the student mobile (ambulatory or nonambulatory)? 

• What is the student's communication ability and 
communication system used? 

• What is the student's attendance history? 

• What are the student's learning strengths (visual, auditory, 
tactile)? 

• What modifications, accommodations (including assistive 
devices and technology) and supports are provided? 

• What behavior interventions and positive behavior 
supports are used with this student? 

• What general supports and prompt hierarchy does the 
student need throughout the day? 

• How does the student interact with others in the school 
environment (natural supports)? 

• What IEP objectives or benchmarks address functional
academic skills?

• How does the student participate in different school 
environments? 

Points are awarded for each answer up to a total of 25. 

The state mandates three essence statements to be measured,
and teachers choose two. Teachers decide on activities to
measure all five tasks.

Teachers use the state resources to develop assessment
activities for students that reflect the instruction they
have received on prerequisite skills linked to grade-level
expectations. Students being assessed with TAKS-ALT can
have whatever accommodations or supports the teacher feels
are necessary for the student to be as independent as possible
during the activity. The state provides suggestions and hints
on how to develop a good assessment.



Source: Arkansas Department of Education web sites (http://www.arkansased.org/students/assessment.html; arkedu.state.ar.us/actaap/index.htm;
http://www.arkedu.state.ar.us/actaap/student_assessment/student_assessment_p1.htm) and survey and interviews with department contacts (C. Marvel and T. Hicks);
Louisiana Department of Education web sites (http://www.doe.state.la.us/lde/saa/2273.html; http://www.doe.state.la.us/lde/accountability/home.html;
www.doe.state.la.us/lde/saa/2343.html) and survey and interviews with department contact (J. Johnson);
New Mexico Public Education Department web sites (http://legis.state.nm.us; http://www.ped.state.nm.us/div/acc.assess/assess/info.update.corner.html;
http://www.ped.state.nm.us/div/acc.assess/assess/index.html) and survey and interviews with department contact (D. Farley);
Oklahoma State Department of Education web sites (http://www.sde.state.ok.us/home/defaultie.html; http://www.lsb.state.ok.us; title3.sde.state.ok.us/studentassessment)
and survey and interviews with department contact (A. Daugherty);
Texas Education Agency web sites (http://www.tea.state.tx.us/student.assessment; http://www.tea.state.tx.us/curriculum.html; http://www.legis.state.tx.us;
http://www.sos.state.tx.us/tac/index.shtml; http://www.tea.state.tx.us/student.assessment/resources/taksalt/index.html) and survey and interviews with agency contact (C. Wieland).

TABLE C6

Southwest Region state-provided training on administration and use of results 



Arkansas Louisiana New Mexico 



Intensive regional training. 

State and regional staff provide training 
to district and school staff. 



Regional training. 

Louisiana State Department of 
Education provides training on 
administration and use of results. 



State staff and University of New 
Mexico professors provide training 
on alternate assessment. Training is a 
multiday model, including videotaped 
administrations in which the participants 
score case studies. 

Includes three training modules: 

1. Overview. 

2. Format and overview of clusters 
(what is measured). 

3. Scoring. 

Web-based training is also provided. 

The state and University of New Mexico 
have, in previous years, provided 
training on instruction for the severely 
disabled.

Oklahoma Texas



Texas: The state and 20 regional centers provide training through
modules and presentations that discuss
instructional and assessment decisions for students with the
most significant cognitive disabilities. Modules are available
online at http://pearson.learn.com/taksalt.

There are four modules:

• Topics include defining and explaining Texas Assessment
of Knowledge and Skills-Alternate (TAKS-ALT)
participation guidelines, defining access to grade-level
curriculum, and a step-by-step process to access
grade-level content and standards.

• Topics include recording anecdotal notes and samples of 
student work, making fair observations, time management 
strategies, and effective planning for focused classroom 
observations. 

• Topics include TAKS-ALT scoring rubric, rating and 
expectations of students, evidence or data to be collected 
for the observation evaluation, and how to document 
observations. 

• Topics include descriptions of how to use the actual
TAKS-ALT online assessment with system training simulations.

Oklahoma: Provided annually in five regions. A technical assistance
document is provided annually.



Source: Arkansas Department of Education web sites (http://www.arkansased.org/students/assessment.html; arkedu.state.ar.us/actaap/index.htm;
http://www.arkedu.state.ar.us/actaap/student_assessment/student_assessment_p1.htm) and survey and interviews with department contacts (C. Marvel and T. Hicks);
Louisiana Department of Education web sites (http://www.doe.state.la.us/lde/saa/2273.html; http://www.doe.state.la.us/lde/accountability/home.html;
www.doe.state.la.us/lde/saa/2343.html) and survey and interviews with department contact (J. Johnson);
New Mexico Public Education Department web sites (http://legis.state.nm.us; http://www.ped.state.nm.us/div/acc.assess/assess/info.update.corner.html;
http://www.ped.state.nm.us/div/acc.assess/assess/index.html) and survey and interviews with department contact (D. Farley);
Oklahoma State Department of Education web sites (http://www.sde.state.ok.us/home/defaultie.html; http://www.lsb.state.ok.us; title3.sde.state.ok.us/studentassessment)
and survey and interviews with department contact (A. Daugherty);
Texas Education Agency web sites (http://www.tea.state.tx.us/student.assessment; http://www.tea.state.tx.us/curriculum.html; http://www.legis.state.tx.us;
http://www.sos.state.tx.us/tac/index.shtml; http://www.tea.state.tx.us/student.assessment/resources/taksalt/index.html) and survey and interviews with agency contact (C. Wieland).

TABLE C7

Alignment and standard setting in the Southwest Region states 



Component Arkansas Louisiana 



Alignment 
of alternate 
standards 
to alternate 
assessments 



Built content standards and achievement standards 
using horizontal and vertical alignment approach. 
Then, the assessment was linked to the standards. 



Built content standards and achievement standards 
using horizontal and vertical alignment approach 
by grade bands. Then, the assessment was linked to 
the standards. 



Overall alignment methodology

Arkansas: Webb.

Louisiana: WestEd.



Standard setting methodology

Arkansas: Bookmarking (state-declared), in which reviewers mark
the spot in a specially constructed test booklet,
arranged in order of item difficulty, where a desired
percentage of minimally proficient or advanced
students would pass the item, or standard setters
mark where the difference between proficient
and advanced performance on an exercise is a
desired minimum percentage of students; reasoned
judgment (a score scale, such as 32 points, is divided
into a desired number of categories, such as 4, in
some way, such as equally, larger in the middle, and
so on; the categories are determined by a group of
experts, policymakers, or others); and judgmental
policy capturing (reviewers determine which of the
various components of an overall assessment are
more important than others, so that components or
types of evidence are weighted).

Louisiana: Bookmarking (state-declared),

Summary of scoring

Arkansas: State scores.

Two readers score each portfolio on a 4-point scale.
If the scores are not adjacent (such as 2 and 4), a
third reader scores for resolution.

Prior to scoring, a range-finding committee meets
to establish scoring decisions and pulls exemplar
papers in grades 3-8 and 11.

Teachers must submit three entries for three English
language arts strands, two entries for five strands in
math and science, and two entries for three strands
in social studies.

The rubric is weighted by domain (performance,
context, and level of assistance settings). English
language arts has 540 total points, math has 600
points, and science has 360 points.

Scores for students with disabilities or English
language learners participating in the Arkansas
Alternate Assessment Program are reported by the
state, district, and school in separate reports at all
levels.

Louisiana: State scores on scoring rubric.

Two test administrators observe and rate the
student. The state specifies two target indicators
for each participation level. The test administrator
selects an additional three target indicators. The
rubric has a scale of 1 to 6.

Prior to scoring, a range-finding committee meets
to establish scoring decisions and pulls exemplar
papers.

Teachers submit five entries for each strand in
each subject area. There were a total of 20 target
indicators in 2005/06: five in English language arts,
five in math, six in social studies, and four in science.

Scaled score ranges were used in the reporting
of achievement levels. All content areas had a
maximum of 340 scaled scores. Cutscores vary by
content area and grade.



New Mexico Oklahoma Texas


Built content standards and 
achievement standards using horizontal 
and vertical alignment approach. 

Then, the assessment was linked to the 
standards. 


Built content standards and 
achievement standards using horizontal 
and vertical alignment approach. 

Then, the assessment was linked to the 
standards. 


Built content standards and 
achievement standards using horizontal 
and vertical alignment approach. 

Then, the assessment was linked to the 
standards. 


Overall alignment methodology: Webb (New Mexico, Oklahoma, and Texas).



Body of work (state-declared; reviewers
use a student's data to place the student
in one of the overall performance levels,
and standard setters receive a set of papers
that demonstrate the complete range
of possible scores from low to high) and
reasoned judgment.



Body of work (state-declared) and 
reasoned judgment. 



Body of work, reasoned judgment, and 
judgmental policy capturing. 



One person scores the performance of
each student on a scale of 0-4. Each of
four performance tasks in each content
area is rated separately.

The scoring rubric is summarized below:

0 = Unable to perform.

1 = Acquisition: student can perform
20 percent of task.

2 = Fluency building: student performs
60 percent or more of task with
or without minimal assistance or
prompting.

3 = Maintenance: student performs
80 percent or more of the task,
with very minimal assistance or
prompting on a regular basis.

4 = Generalization: student performs
90 percent or more of task without
prompting on a regular basis.



A team of three to four special educators
scores the portfolio through the state.
Each rubric uses a 4-point criterion. Ten 
percent of the portfolios are scored by 
a second team to establish inter-rater 
reliability. 

Five pieces of evidence with support 
proof (such as videotapes and work 
evidence) are required for each content 
area. 

The content rubric consists of a possible 
100 points: 

25 points for portfolio content. 

75 points for the evidence content 
rubric. 



The teacher scores the student portfolio 
using a rubric of 1-3 for demonstration 
of skill and 1-3 for level of support. If the
student receives ratings of 2 or higher,
then the student can be rated on the
generalization of skill aspect of the 
rubric, receiving a yes (rating of 1) or no 
(rating of 0). 

Students are scored on three state- 
selected essence statements per content 
area being assessed. The teachers observe 
the students completing teacher- 
designed activities that link to the TAKS 
curriculum. The teacher also selects 
two additional essence statements and 
designs activities that are to be used. 

Raw scores are converted to proficiency 
levels. The assessment system was field 
tested in 2006/07. An alignment study 
and standard setting were scheduled for 
June 2007. 



Source: Arkansas Department of Education web sites (http://www.arkansased.org/students/assessment.html; arkedu.state.ar.us/actaap/index.htm;
http://www.arkedu.state.ar.us/actaap/student_assessment/student_assessment_p1.htm) and survey and interviews with department contacts (C. Marvel and T. Hicks);
Louisiana Department of Education web sites (http://www.doe.state.la.us/lde/saa/2273.html; http://www.doe.state.la.us/lde/accountability/home.html;
www.doe.state.la.us/lde/saa/2343.html) and survey and interviews with department contact (J. Johnson);
New Mexico Public Education Department web sites (http://legis.state.nm.us; http://www.ped.state.nm.us/div/acc.assess/assess/info.update.corner.html;
http://www.ped.state.nm.us/div/acc.assess/assess/index.html) and survey and interviews with department contact (D. Farley);
Oklahoma State Department of Education web sites (http://www.sde.state.ok.us/home/defaultie.html; http://www.lsb.state.ok.us; title3.sde.state.ok.us/studentassessment)
and survey and interviews with department contact (A. Daugherty);
Texas Education Agency web sites (http://www.tea.state.tx.us/student.assessment; http://www.tea.state.tx.us/curriculum.html; http://www.legis.state.tx.us;
http://www.sos.state.tx.us/tac/index.shtml; http://www.tea.state.tx.us/student.assessment/resources/taksalt/index.html) and survey and interviews with agency contact (C. Wieland).
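
As a rough illustration of two rules described in table C7, the Python sketch below (1) checks whether two reader scores on Arkansas's 4-point portfolio scale are non-adjacent and would therefore go to a third reader, and (2) divides a score scale into equal bands, as in the reasoned judgment example of a 32-point scale split into 4 categories. The function names, the assumption that the 4-point scale runs from 1 to 4, the treatment of identical scores as agreement, and the choice to start the score scale at 1 are illustrative assumptions, not details taken from state documentation.

```python
from typing import List, Tuple


def needs_third_reader(score_a: int, score_b: int) -> bool:
    """Return True when two reader scores are not adjacent.

    Assumption (hypothetical): the 4-point scale runs from 1 to 4, and
    identical or adjacent scores (such as 2 and 3) need no resolution.
    """
    for score in (score_a, score_b):
        if not 1 <= score <= 4:
            raise ValueError(f"score {score} is outside the assumed 1-4 scale")
    return abs(score_a - score_b) > 1


def equal_bands(max_score: int, n_categories: int) -> List[Tuple[int, int]]:
    """Split the scale 1..max_score into equal-width (low, high) bands.

    This covers only the "equally" option named in the table; other
    divisions (such as larger in the middle) would need explicit cut points.
    """
    if n_categories <= 0 or max_score % n_categories != 0:
        raise ValueError("scale must divide evenly into the categories")
    width = max_score // n_categories
    return [(i * width + 1, (i + 1) * width) for i in range(n_categories)]


if __name__ == "__main__":
    # The table's own example: scores of 2 and 4 are not adjacent.
    print(needs_third_reader(2, 4))
    # The table's own example: a 32-point scale divided into 4 categories.
    print(equal_bands(32, 4))
```

How the third reader's score is combined with the first two is not stated in the table, so the sketch only flags the portfolio for resolution.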
TABLE C8

Review procedures and recent changes 



Procedure Arkansas Louisiana


Special education advisory council/ 
committees 


Meets annually. 

Reviews policies and practices of 
alternate assessment and gives general 
advice on training needs. 


Meets quarterly. 

Reviews policies and practices of 
alternate assessment and gives general 
advice on training needs. 


Changes in alternate assessment for 
2007/08 and beyond 


Adding high school science in 2009/10. 
Alternate content and achievement 
standards are in progress. 


Changing standards from functional to 
academic. Building a new assessment 
system.



New Mexico Oklahoma Texas


Meets on alternate months with 
additional meetings as needed. 

Reviews all aspects of alternate 
assessment, including items and scoring 
procedures. 


Meets semiannually. 

Reviews policies and practices of 
alternate assessment and gives general 
advice on training needs. 


Several separate advisory committees
meet, but there is a core group on
all committees (for example, the testing
overview committee meets monthly).

Reviews all aspects of alternate 
assessments, including items. 


2005/06: Checklist.

2006/07: Checklist being phased out and
performance standards being phased
in. Completed data review and standard
setting in June 2007.

2007/08: New system operational.


Standard setting. Teachers are shifting 
from functional skills to grade-level 
academic skills. 


Phasing out local choices and SDAA II. 
Moving to a portfolio system. In process 
of changing from functional skills to 
academic standards. 



Source: Arkansas Department of Education web sites (http://www.arkansased.org/students/assessment.html; arkedu.state.ar.us/actaap/index.htm;
http://www.arkedu.state.ar.us/actaap/student_assessment/student_assessment_p1.htm) and survey and interviews with department contacts (C. Marvel and T. Hicks);
Louisiana Department of Education web sites (http://www.doe.state.la.us/lde/saa/2273.html; http://www.doe.state.la.us/lde/accountability/home.html;
www.doe.state.la.us/lde/saa/2343.html) and survey and interviews with department contact (J. Johnson);
New Mexico Public Education Department web sites (http://legis.state.nm.us; http://www.ped.state.nm.us/div/acc.assess/assess/info.update.corner.html;
http://www.ped.state.nm.us/div/acc.assess/assess/index.html) and survey and interviews with department contact (D. Farley);
Oklahoma State Department of Education web sites (http://www.sde.state.ok.us/home/defaultie.html; http://www.lsb.state.ok.us; title3.sde.state.ok.us/studentassessment)
and survey and interviews with department contact (A. Daugherty);
Texas Education Agency web sites (http://www.tea.state.tx.us/student.assessment; http://www.tea.state.tx.us/curriculum.html; http://www.legis.state.tx.us;
http://www.sos.state.tx.us/tac/index.shtml; http://www.tea.state.tx.us/student.assessment/resources/taksalt/index.html) and survey and interviews with agency contact (C. Wieland).

APPENDIX D

STUDY METHODS AND LIMITATIONS 

The selection of methods for this study was informed
by researchers’ experience studying alternate
assessments (Sato, Rabinowitz, & Wilkins, 2007),
by national studies investigating state policies and
practices on alternate assessments (such as those
by the National Center on Educational Outcomes;
Thompson & Thurlow, 2001, 2003; Thompson et al.,
2005), and by studies on alternate assessment issues
and practices in one or more states (such as Browder
et al., 2002, and Browder et al., 2005; Thompson,
Case, & Thurlow, 2000; and Thurlow & Case, 2004).

The researchers investigated the challenges to designing
and implementing alternate assessments
and developed six questions to guide their study,
using different methods to answer each:

1. What challenges are states encountering when
implementing new alternate assessment policies
and practices? (literature review, publicly
available state documents, survey, interview)

2. What do alternate assessments across the 
Southwest Region states look like? (publicly 
available state documents, survey, interview) 

3. What training or professional development
is provided for teachers on alternate assessments?
(literature review, publicly available
state documents, survey)

4. How are results collected and used at the state, 
district, school, and student levels? (literature 
review, publicly available state documents, 
interview) 

5. To what extent do state alternate assessments 
capture the same or similar skills as state tests 
designed for the general student population? 
(literature review, interview) 

6. What technical issues are states facing in
developing and implementing reliable and
valid alternate assessments? (literature review,
survey, interview)

Researchers looked for data that were reliable, 
valid, and targeted the research questions. 



Data collection 

Quantitative data collection involved state material
audits, surveys, and interviews using descriptive
techniques. Qualitative data collection consisted
of semistructured surveys and a review of documents,
which researchers analyzed with coding to
develop themes and categories.

Step one was to collect state-specific materials on
alternate assessments, including test administrator
manuals; descriptions of laws, regulations, and
policies; and descriptions of tasks. Step two was
to ask states to send copies of nonsecure materials
that could not be located on the web or that
were being revised for 2007/08. Step three was to
develop and administer a survey for staff in the
five state education agencies. The survey was sent
to the person in charge of alternate assessment
for each state. In some instances states chose to
have more than one person respond. Alternate
assessment contacts from each state were asked
by email whether they would prefer to fill out
their responses independently or have a telephone
interview. All opted to respond to the questions
independently. (See appendix E for the survey
instrument.) Step four was to interview representatives
from all five states by email or telephone
to clarify information about each state’s alternate
assessment system.



Document analysis 

Before beginning the analysis, researchers read
the available literature and state documents
to refamiliarize themselves with each state’s
alternate assessment systems. They organized
the information into a matrix to compare state
policies, practices, and procedures (see appendix C).
They compared multiple data sources to
increase reliability. All findings had to be verified
by additional primary or secondary sources.
For example, to verify the status of peer review,
researchers used the Peer Review Status Letters
sent to states by the U.S. Department of Education
(2006a) and verified the information through
interviews, information gained from professional
meetings, or survey data.

The study replicated the procedures used by
Browder et al. (2005), a mixed-methods approach
to research first discussed in relation to
this population by Creswell (2002). The approach
includes qualitative and quantitative methods
that, when systematically combined, provide
rigorous, methodologically sound investigations
in a range of fields (Creswell, Fetters, & Ivankova,
2004). Triangulation and data transformation
models add rigor. Browder (2006) and Ahlgrim-Delzell,
Flowers, Browder, and Wakeman (2006)
have successfully used the method for more than
20 studies. Other researchers have been able to
replicate their work.

Following Browder et al. (2005), researchers began
analyzing each state’s alternate assessment from
an emic perspective (that of one who participates
in the administration and development of a state’s
alternate assessment) and retained the language
of individual states to describe their alternate
assessment systems. The researchers then shifted
to an etic perspective (a more analytical perspective
where researchers compare phenomena) and
developed common terminology for describing
commonalities and differences (Creswell,
2002). Researchers identified criteria from the
literature to analyze specific aspects of alternate
assessments.

The method merges information gleaned from
quantitative and qualitative data to best
understand the research topic.



Study limitations 

Whenever evidence is collected and reviewed 
by a third party, ensuring the reliability of data 
collection and the validity of findings poses a 
special challenge. Researchers made every effort 
to conduct comprehensive searches for print and 
online information and to verify and clarify their 
findings with survey and interview methods (Cre- 
swell, 2002). Using an iterative process, research- 
ers reviewed print and electronic resources. They 
clarified and augmented this information with the 
survey results and by communicating through tele- 
phone and email. Despite these efforts, or perhaps 
because of them, researchers often found inconsis- 
tencies between required policies and their under- 
standing and implementation in state practice. 

Future studies would benefit from additional data
verification procedures, such as site visits, interviews
with local education agency staff, or interviews
with a broader group of state education agency
staff (sources that were not available for this
study), to obtain a more robust picture of states’
alternate assessment policies, practices, programs, 
and needs. Interviews with a formal protocol 
might be used to collect data more independently, 
rather than just to clarify survey findings. Full 
access to state technical reports would have been 
invaluable. The researchers did not explore in- 
depth questions about scoring techniques or the 
alignment of alternate assessment scoring linkages 
to general assessment. 

This study’s findings represent only a snapshot 
of states’ alternate assessment policies, practices, 
and programs at a particular time. Because the 
findings are based on a small sample of states 
linked only by geographic proximity, caution is 
warranted in drawing comparisons across states 
that are dissimilar in other meaningful ways and 
in generalizing beyond the sample. 

APPENDIX E

ALTERNATE ASSESSMENT INTERVIEW FOR 
THE SOUTHWEST REGION STATES 

Thank you for agreeing to complete these ques- 
tions. As I explained to you on the phone, this 
research study is not for accountability purposes, 
but for your regional laboratory (REL SOUTH- 
WEST/Edvance) to use as a “state of the states” for 
alternate assessment systems in your region. This 
information will go to your regional laboratory 
and will be shared with the states in your region to 
support or improve alternate assessment policies 
and practices. 



About the study 

What: This study focuses on examining state 
practices and policies for the most severely cogni- 
tively disabled student population (i.e., the lowest 
1 percent of students with disabilities) across five 
states in your region. 

How: The interview protocol includes 9 questions 
concerning the alternate assessment policies and 
practices of your state. It should take about 30 
minutes to complete. 

Outcomes: Your answers and those of the four 
other states will be collected and reviewed so that 
we can gather critical information on: 

a. common/effective elements of state policy re- 
lated to alternate assessments for the most se- 
verely cognitively disabled student population; 

b. critical elements affecting state policy (such as 
student demographics, teacher qualifications, 
stakeholder interests, resource availability); 

c. common/promising alternate assessment 
practices for this population of students; 

d. potential challenges for policy and practice 
implementation; and 

e. strategies for addressing such challenges. 



These outcomes will inform other states in 
your region that are facing similar influences 
affecting the implementation of state alternate 
assessment policy and practice. 



Interview questions 

We have looked at publicly available information, 
through the state web site, but need additional 
information, which we hope to glean through the 
following questions: 

1. What is the format of your state’s alternate as- 
sessment? (such as portfolio, checklist, rating 
scale) 

a. How did the state decide on this format? 

2. How congruent is the description of the in- 
tended population of students with disabilities 
and the actual population assessed? 

a. Does the local education agency or state 
education agency monitor this issue? 

3. As mentioned, we looked at web sites for 
information on programs, practices, and 
products related to alternate assessments: 

a. Have any policies or practices been added
in the past six months?

b. Are there other alternate assessment poli- 
cies or initiatives being considered by the 
state at this time? 

4. What are the particular alternate assessment 
priorities on which the state is focusing? 

For example, priorities might include: 

• access to general curriculum 

• alignment to curriculum or content 
standards 

• raising expectations

• educator accountability

• student demographics 

• resources 

• stakeholder interests 

5. What support is the state providing local educa- 
tion agencies? 

For example: Is training or professional develop- 
ment provided to teachers on alternate assessments? 



6. How is the impact of the policies and/or practices
being measured? That is, the outcomes or effects of
the policies/practices upon which you base success.

Examples might include:

• increased assessment scores

• increased student attendance

• access to general curriculum

• inclusion

• changes to SPED curricula & instruction

7. How are data being used and monitored at the
state, district, and school levels?

For example:

Used

• feedback purposes

• program level

• instruction for teachers

• teacher training

Monitored

• type of assessment

• student participation (who, where, etc.)

• quality of assessment

• administration and scoring procedures

• overall progress

8. Is there technical evidence that supports a link
between the state’s alternate assessments and a
comprehensive curriculum (i.e., alignment to
instruction and content standards)?

For example (per peer review guide): validity,
reliability, fairness/accessibility, comparability of
results, procedures for test administration, scor-
ing, data analysis, and reporting, interpretation
and use of results.

Thank you for your participation. If you have any
questions, please feel free to e-mail or call WestEd
directly.